SCREENING FOR GENETIC VARIATION

The invention relates to the detection of sequence 
differences between test and reference nucleic acids; 
that is, to means and methods for the detection of the 
existence in a test polynucleotide of a genetic defect, 
or variation, from a reference, typically wild-type,
polynucleotide. The invention is useful in clinical, 
forensic, and research contexts. 

Background of the Invention 

Methods known in the art for comparing nucleotide 
sequence differences in DNA molecules are reviewed in 
Cotton, R., 1989, Biochem. J. 263:1, and include those 
aimed at detecting sequence differences when the 
sequence and location of a given region of DNA are
known, discovering previously unknown mutations in a 
known region of DNA, and locating a previously unknown 
region containing a mutation. 

Previous methods of detecting known sequence
differences include: the failure of an oligonucleotide
having a wild-type DNA sequence to hybridize under
stringent conditions to sample DNA containing a
mutation; the failure of PCR primers to hybridize under
stringent conditions to sample DNA containing a
mutation, and the consequent failure of sample DNA
containing a mutation to become amplified using PCR;
the failure of adjacent oligonucleotides to ligate due
to a failure of one or both oligonucleotides to
hybridize under stringent conditions to sample DNA
containing a mutation; the use of primer extension
analysis to detect incorporation of differentially
labeled bases where the primer hybridizes to the sample
DNA adjacent to the mutation; and the detection of
changes in cleavability of a restriction enzyme site as
an indicator of the presence of a mutation.

Previous methods of detecting a mutation of unknown
identity within a known region of the genome include
those in which a heteroduplex molecule is created from
one strand of test DNA and one strand of reference DNA.
Mismatches between the reference and test DNAs may be
detected by carbodiimide modification of mismatched
thymidine (T) and guanine (G) bases and detection of
the resultant mobility shifts of modified versus
control DNA; by ribonuclease cleavage of mismatched
pyrimidine bases of RNA/DNA hybrids, and detection of
points of cleavage in the molecule; by detection of
differences in melting temperature between heteroduplex
and homoduplex DNA, e.g., by denaturing gel
electrophoresis; and by chemical modification of
mismatched bases using hydroxylamine (to modify
cytosine) or osmium tetroxide (to modify thymidine),
followed by piperidine cleavage and subsequent
detection of cleaved DNA. Additional methods for
detecting an unknown mutation within a region of DNA
include detecting differences in secondary structure
by looking for differential mobility in gels of single
stranded reference and test DNA, and direct
sequencing of both reference and test DNAs.

Several methods of locating mutations where both
the identity and region of the mutation are unknown
are described in the art. RFLP analysis, in which
Restriction Fragment Length Polymorphisms are analyzed,
identifies sequence differences which occur at
restriction enzyme cleavage sites of test and reference
DNAs, or which result from the insertion or deletion of
a number of bases. RFMP
analysis (Gray, 1992, Amer. J. Hum. Genet. 50:331) is a
variation of RFLP analysis in which denaturing gradient
gel electrophoresis is used to identify sequence
variations both at and between restriction enzyme
cleavage sites.

The Southern Cross method, described in Potter and 
Dressler (1986, Gene 48:229), also depends upon
sequence differences between test and reference DNAs 
that occur at sites of restriction enzyme cleavage. In 
this method, a reference DNA is digested with one or 
more restriction enzymes and analyzed by a modified 
Southern procedure. According to this modified 
Southern procedure, hybridization of two identical 
membranes, which are positioned at 90° angles with 
respect to each other, gives a signal that forms along 
a diagonal line of hybridization. In contrast, where 
test and reference membranes are hybridized at 90° 
angles, differences in restriction fragment patterns 
between the test and reference DNAs are indicated by 
off-diagonal signals.

Finally, the differential genomic DNA cloning
method depends upon the inability of dephosphorylated
reference DNA in a reference/test DNA hybrid to ligate
to dephosphorylated vector DNA. In this method,
described in Yokota and Oishi (1990, Proc. Nat. Aca.
Sci. 87:6398), test and reference DNAs are digested
separately with restriction enzymes, reference DNA is
then dephosphorylated, and the two DNAs are combined at
a ratio of 100/1 of reference to test DNA. The mixture
is subjected to agarose gel electrophoresis, and the
DNA is denatured and renatured in the gel, such that
unique restriction fragments will likely self-anneal
and non-unique fragments will likely reanneal with
reference strands. Subsequent cloning of the
reannealed fragments will favor reannealed test DNA
clones, since the dephosphorylated reference DNA or
reference/test hybrids will not be ligated to a
dephosphorylated vector.

DNA mispairing can occur in vivo and is recognized
and corrected by repair proteins. Mismatch repair has
been studied most intensively in E. coli, Salmonella
typhimurium, and S. pneumoniae. The MutS, MutH and
MutL proteins of E. coli are involved in the repair of
DNA mismatches, as is the product of the uvrD gene in
E. coli, helicase II. MutS appears to play a central
role in mismatch correction. Besides the repair system
directed by Dam-mediated methylation of d(GATC) sites,
MutS is also active in two other less efficient
mismatch repair processes. One of these processes acts
on symmetrically methylated DNA and may serve to repair
mismatches produced during recombination. The other
corrects cytosine (C) to thymidine (T) transitions at
the internal C of the Dcm methylase sequence d(CCA/TGG)
or subsets thereof and also requires mutL+ and dcm+.

Mismatched base pairs can arise in vivo during
homologous recombination of allelic genes, by chemical
modification of DNA, or from errors made by DNA
polymerase. Repair of mismatched DNA base pairs has
been invoked to explain a variety of genetic phenomena,
including gene conversion in Neurospora spp. and other
fungi (Mitchell, 1955, Proc. Nat. Aca. Sci. 41:215;
Rossignol, 1969, Genetics 63:795), postmeiotic
segregation in Saccharomyces cerevisiae (Williamson et
al., 1985, Genetics 110:609), high negative
interference and gene conversion in lambda phage
crosses (Nevers et al., 1975, Mol. Gen. Genet. 139:233;
White et al., 1974, Proc. Nat. Aca. Sci. 71:1544;
Wildenberg et al., 1975, Proc. Nat. Aca. Sci. 72:2202),
and the existence of high and low efficiency
transforming markers in Streptococcus pneumoniae
(Ephrussi et al., 1966, J. Gen. Physiol. 49:211; Lacks,
1966, Genetics 53:207).

Jiricny et al. (1988, Nucl. Ac. Res. 16:7843)
performed in vitro binding experiments using MutS and a
series of synthetic DNA duplexes containing known
mismatches or mismatch analogues of the
purine/pyrimidine type in order to demonstrate that
MutS binds in vitro to double-stranded DNA containing a
mismatched nucleotide pair. Su et al. (1986, Proc.
Nat. Aca. Sci. 83:5057) have shown that highly purified
MutS binds to a purified 120 base pair restriction
fragment containing a single mismatch in vitro and
protects approximately 22 nucleotides surrounding the
mismatch against DNase attack. Su et al. (1988, J.
Biol. Chem. 263:6829) demonstrate that MutS recognizes
all eight possible DNA base mismatches.

McKay (1981, J. Mol. Biol. 145:471), hereby
incorporated by reference, describes a method of
purifying certain SV40 DNA restriction fragments using
an immunoprecipitation procedure in which the SV40 T
antigen-related protein binds to these DNA fragments.
Blackwell and Weintraub (1990, Science 250:1104),
hereby incorporated by reference, describe a method of
purifying DNA sequences that bind to a protein of
interest based on amplification of a binding site. The
protein of interest is bound to DNA fragments and the
bound fragment(s) is isolated using an electrophoretic
mobility shift assay.

Objects of the invention include methods for rapid
and accurate genetic screening and diagnosis by
comparing two nucleic acids for differences in their
nucleotide sequences. Another object is to diagnose
genetic diseases in mammals, especially humans, by
rapid screening for a previously observed mutation(s)
known to cause a genetic disease. Another object is to
rapidly screen the genome of an individual for genetic
variation of a specific region of DNA, where the nature
and position of the variation is unknown, by comparing
a nucleic acid sequence known to reflect normal gene
function with a nucleic acid sample suspected to
contain a genetic defect. Yet another object is to
locate previously unknown mutations of a nucleotide
sequence and to identify the sequence itself, where the
nature and position of the mutation within a region of
the genome is unknown, and where the location of the
region itself is unknown.



Summary of the Invention 



The invention provides methods of detecting and/or 
identifying polynucleotide sequence differences which 
may be the basis for genetic disease. The method 
involves hybridizing a "test", i.e., a potential 
variant, nucleic acid, e.g., from a patient, with a 
nucleic acid standard. If the test and standard 
(reference) nucleic acids contain one or more 
nucleotide sequence differences, then the double 
stranded nucleic acid formed from hybridization of the 
sequences will contain one or more nucleotide pair 
mismatches, i.e., will comprise a heteroduplex. In 
accordance with the invention, protocols are provided 
which permit detection of the presence of the 
heteroduplex, and/or segregation of a fraction rich in 
heteroduplex. The detection and fractionation methods 
involve exploitation of the selective binding 
properties of mismatch binding proteins. 

The invention encompasses methods which allow for 
detection of differences between nucleotide sequences 
with greatly increased sensitivity. The methods of the 
invention allow one to detect single or multiple 
nucleotide differences between a nucleic acid standard 
and a sample nucleic acid without relying on 
restriction fragment length differences. The invention 
also provides for enrichment of heteroduplex fragments 
containing mismatches, even in a sample containing 
excess homoduplex, thereby achieving more sensitive 
detection of the mismatch. The methods also may be 
used quantitatively to determine the fraction of 
heteroduplex fragments in a mixture, and the proportion 
of mismatch binding protein bound to heteroduplex, and
thus also may be used to determine the number of
mismatches within a test sample. The methods also 
allow for recovery of nucleic acid fragments containing 
sequence mismatches from a mixture containing excess 
fully complementary fragments. Recovered fragments may
be analyzed further, for example, to determine the 
identity and position of the mismatch by determining 
the nucleotide sequence of the mismatch region. 

In a first aspect, the invention features methods
of genetic screening for a nucleotide variation which
generally include the following steps. A mixture of
nucleic acids which includes heteroduplex nucleic
acids, i.e., heteroduplex including a test nucleic acid
strand hybridized with a reference nucleic acid strand
generated by annealing test and reference nucleic acid,
and which includes a mismatched nucleotide pair, is
subjected to a mismatch binding protein under
conditions which promote binding of the protein to
heteroduplex in the mixture to form a
heteroduplex/binding protein complex. The presence of
the mismatched nucleotide pair then is detected, using
the methods disclosed below, as an indication of the
presence of genetic variation between the test and
reference nucleic acids.

In preferred embodiments of this aspect of the
invention, the mixture provided may be a complex
mixture of different nucleic acid fragments, some of
which are heteroduplex fragments, but many or a
majority of which are homoduplex nucleic acids. The
test nucleic acid may be isolated from a collection of
organisms and may include nucleic acid from any tissue
or cell of several members of a species.
Alternatively, the test nucleic acid may be sampled
from an individual and thus may comprise nucleic acid
from one unique representative of a species. In
addition, the test nucleic acid may be suspected, but
not known, to contain a nucleotide variation from a
wild-type sequence which encodes a normal, functional
protein or regulatory element. A nucleotide variation
in the test nucleic acid comprises one half of a
mismatched nucleotide pair when the test nucleic acid
is hybridized to the reference nucleic acid.

The mixture of nucleic acids provided in the method
typically is generated by annealing the test and
reference nucleic acids. The test nucleic acid may be
produced by cleaving double stranded test nucleic acid
into a fragment which spans the same nucleotide
region(s) as the reference nucleic acid(s). Both the
test and reference nucleic acids may be either single
or double stranded. If either is double stranded, the
test mixture must be "melted", i.e., denatured to
produce single stranded polynucleotide, before
annealing. Generally, the test and the reference
nucleic acids may be genomic DNA, cDNA, mRNA, synthetic
polynucleotide, mitochondrial DNA, amplified or
circular DNA, or other single or double stranded
polynucleotide, from whatever source. While it is
preferable that the reference nucleic acid be single
stranded, it also may be double stranded.

The annealed mixture of test and reference nucleic
acids will include a concentration of heteroduplexes if
the test nucleic acid embodies at least one base
difference from the reference. The heteroduplexes
present in this mixture may be fractionated from the
mixture by affinity purification in which a mismatch
binding protein binds to the heteroduplexes
preferentially to the homoduplexes in the mixture. The
bound heteroduplexes may then be recovered from the
affinity purification, e.g., released, to produce a
fraction which contains a higher concentration of
heteroduplex.
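
By way of illustration only, the composition of such an annealed mixture may be estimated with a short calculation. The sketch below assumes, as a simplification not stated in this description, that both nucleic acids are double stranded, fully denatured, and that each strand reanneals with a complementary strand chosen at random regardless of origin. The function name and its parameters are hypothetical.

```python
def duplex_fractions(test_amount: float, reference_amount: float) -> dict:
    """Estimate duplex fractions after denaturing and reannealing.

    Illustrative model only: assumes random pairing of complementary
    strands, independent of whether they come from test or reference.
    """
    total = test_amount + reference_amount
    if test_amount < 0 or reference_amount < 0 or total <= 0:
        raise ValueError("amounts must be non-negative and not both zero")
    p = test_amount / total        # fraction of strands from the test sample
    q = reference_amount / total   # fraction of strands from the reference
    return {
        "test/test": p * p,
        "reference/reference": q * q,
        # Test/reference hybrids are heteroduplex only where the test
        # strand actually differs from the reference strand.
        "test/reference": 2 * p * q,
    }


if __name__ == "__main__":
    for name, fraction in duplex_fractions(1.0, 1.0).items():
        print(f"{name}: {fraction:.3f}")
```

Under this assumed model, even an equimolar mixture yields test/reference hybrids as only a portion of the reannealed duplexes, which is consistent with the use of a fractionation step to raise the heteroduplex concentration.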

The methods of genetic screening also may include 
the immobilization of reference nucleic acid to a solid 
support. For example, reference nucleic acids may be 
immobilized to a solid surface in an array of plural, 
spaced-apart spots. The spots of reference nucleic 
acid are then exposed separately under hybridizing 
conditions to a test nucleic acid such that the test 
and immobilized reference nucleic acids are able to 
form a hybrid. The hybrids then are contacted with the 
mismatch binding protein under conditions sufficient to 
allow the binding protein to bind to a heteroduplex 
containing a mismatched nucleotide pair. Finally, the 
bound mismatch binding protein, or the 
heteroduplex/protein complex, is detected as an 
indication of genetic variation between the test sample 
and the reference nucleic acid at that spot. 

Detection of the heteroduplex may be conducted by 
detecting the mismatch binding protein that is bound to 
the heteroduplex, e.g., using a labeled form of the 
mismatch binding protein or a separate binding protein 
such as an antibody specific for the mismatch binding 
protein. Alternatively, the heteroduplex may be 
detected by detecting the complex, e.g., with an 
antibody specific for an epitope on the 
heteroduplex/mismatch binding protein complex.
Alternatively, the bound mismatch binding protein or
bound heteroduplex may be released from the complex
before detection of the released component.
Alternatively, the mismatch binding protein may modify
the heteroduplex before it releases, and the
modification may be subsequently detected. The
heteroduplex itself can include a detectable moiety,
e.g., a radioactive or other label bound to the
reference nucleic acid, and the detecting step can
include detecting the detectable moiety after
fractionation of the heteroduplex. The methods may
also include, in addition to detecting the presence of
a mismatched nucleotide pair, determining the identity
or location of the nucleotide variation in the test
strand. The identity or location of the nucleotide
variation may be determined by analyzing the nucleotide
sequence of the test nucleic acid strand and comparing
it to the sequence of the reference strand.

In a second aspect, the invention features methods
of selectively enriching a nucleic acid preparation in
fragments containing a nucleotide variation, by
enriching for heteroduplex nucleic acids in a mixture.
Selective heteroduplex enrichment of a mixture which
includes a first concentration of heteroduplex nucleic
acids may be performed by separating the heteroduplex
nucleic acids by affinity purification in which the
mismatch binding protein binds to heteroduplex, and
recovering heteroduplex to produce a mixture that
contains a second, higher concentration of
heteroduplex. As a variation on this method, the
mixture first is reacted with a mismatch binding
protein such that the heteroduplex binds to the protein
to form a heteroduplex/protein complex, and then the
complex is separated from the mixture by affinity
purification to produce a mixture having a higher
concentration of heteroduplex. In both variations of
this aspect of the invention, the affinity purification
step involves a binding reaction in which the
heteroduplex is selectively bound by a mismatch binding
protein which preferably is coupled to a solid support,
followed by elution. The binding and elution steps may
be repeated iteratively until a desired degree of
purification of heteroduplexes is achieved. Numerous
modifications of this general procedure are encompassed
by the invention. For example, the mismatch binding
protein/heteroduplex complex may be bound by 1) a
protein specific for one or both components of the
complex, e.g., an antibody, 2) a metal column capable
of binding to a histidine tail engineered onto the
mismatch binding protein, or 3) a protein capable of
binding to a flag sequence on the mismatch binding
protein. A solid support may not be preferable; e.g.,
an antibody may be used to immunoprecipitate the
mismatch binding protein/heteroduplex complex.
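
The effect of repeating the binding and elution steps may be sketched numerically. The following is a minimal illustration, not a description of measured performance: the per-round recovery of heteroduplex and the per-round carryover of nonspecifically bound homoduplex are hypothetical parameters chosen only to show the trend, and the function name is likewise an assumption.

```python
def enrich(hetero: float, homo: float, rounds: int,
           hetero_recovery: float = 0.8,
           homo_carryover: float = 0.05) -> list:
    """Return the heteroduplex fraction before enrichment and after each round.

    hetero_recovery and homo_carryover are assumed, illustrative values for
    the portion of each species retained per binding/elution cycle.
    """
    if hetero <= 0 or homo < 0:
        raise ValueError("hetero must be positive and homo non-negative")
    if not (0 < hetero_recovery <= 1 and 0 < homo_carryover <= 1):
        raise ValueError("recovery values must lie in (0, 1]")
    fractions = [hetero / (hetero + homo)]
    for _ in range(rounds):
        hetero *= hetero_recovery   # heteroduplex retained by the protein
        homo *= homo_carryover      # homoduplex carried over nonspecifically
        fractions.append(hetero / (hetero + homo))
    return fractions


if __name__ == "__main__":
    # Start from a 100-fold excess of homoduplex and run three cycles.
    for cycle, fraction in enumerate(enrich(1.0, 100.0, 3)):
        print(f"after cycle {cycle}: heteroduplex fraction = {fraction:.4f}")
```

With any assumed recovery of heteroduplex that exceeds the carryover of homoduplex, the heteroduplex fraction rises with each cycle, at the cost of some loss of total heteroduplex.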



In both aspects of the invention, the test nucleic
acids may be prepared by, for example, performing a
polymerase chain reaction on a region of interest in a
test nucleic acid sample. In addition, an
amplification step, e.g., by polymerase chain reaction,
may be useful at other points of the methods, e.g.,
after affinity purification of heteroduplex nucleic
acids to produce an amplified heteroduplex sample.
Where a PCR step is performed, it may be necessary to
ligate PCR tails to the test, reference, or
heteroduplex nucleic acids prior to the mismatch
binding protein binding reaction.



In both aspects of the invention, when the
reference nucleic acid is labeled, the methods may
include the additional step of adding excess unlabeled
nucleic acid to the mixture of test and reference
nucleic acids to serve as a competitor to mismatch
binding protein binding, thereby to reduce background.
Background may be caused by the nonspecific binding of
mismatch binding protein to homoduplex nucleic acid.
In this case, detection of labeled reference nucleic
acid does not correlate directly with the amount of
heteroduplex present, even though purification was
conducted with mismatch binding protein, because of
nonspecific interactions between the mismatch binding
protein and homoduplex nucleic acid. However, the
presence of unlabeled competitor creates a dilution
effect on labeled homoduplex nucleic acid, formed by
annealing of reference/reference strands or test/test
strands, which otherwise would be mistaken for
heteroduplex. Alternatively, background may be reduced
using an amplification step. PCR tails are ligated to
the test and reference nucleic acids but not to the
competitor nucleic acid. Excess competitor is added to
the mixture prior to binding of mismatch binding
protein. The subsequent amplification of presumed
heteroduplex nucleic acid purified from the complex
also will result in amplification of nonspecifically
bound homoduplex nucleic acid. However, the presence
of excess competitor nucleic acid lacking PCR tails
will dilute out the effect of nonspecific binding
because nonspecifically bound competitor nucleic acid
will not be amplifiable.
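
The dilution effect described above can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that the mismatch binding protein binds nonspecifically to every homoduplex molecule with equal probability, whether or not it carries a label or PCR tails; the function name is hypothetical.

```python
def detectable_background_fraction(marked_homoduplex: float,
                                   competitor: float) -> float:
    """Fraction of nonspecifically bound homoduplex that remains detectable.

    marked_homoduplex: homoduplex carrying a label or PCR tails.
    competitor: unlabeled, untailed homoduplex added in excess.
    Assumes nonspecific binding does not distinguish between the two.
    """
    total = marked_homoduplex + competitor
    if marked_homoduplex < 0 or competitor < 0 or total <= 0:
        raise ValueError("amounts must be non-negative and not both zero")
    return marked_homoduplex / total


if __name__ == "__main__":
    for fold in (0, 2, 10, 100):
        share = detectable_background_fraction(1.0, float(fold))
        print(f"{fold}-fold competitor: {share:.3f} of background detectable")
```

Under this assumption, a 100-fold excess of competitor leaves roughly one percent of the nonspecifically bound homoduplex detectable or amplifiable, while specific binding to heteroduplex is unaffected in this simple model.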



WO 93/22457 



PCT/US93/03777 






In another aspect, the invention features apparatus
for conducting comparisons of the sequence of test and
reference nucleic acid, and for determining the
existence or nature of a difference between two or more
nucleic acid sequences. Broadly, these apparatus
include, as essential elements, a mismatch binding
protein, and either or both of means for detecting the
presence of the protein or a protein/heteroduplex
complex, and means for separating heteroduplex from
homoduplex in a mixture.

A kit for detecting a heteroduplex nucleic acid as
an indication of genetic variation may include an array
of separately spaced reference nucleic acids coupled to
a support, and a mismatch binding protein. Preferably,
the mismatch binding protein is labeled, but
alternatively, the kit may include a protein that binds
the mismatch binding protein, e.g., a labeled protein
such as an antibody or an unlabeled antibody that is
bound by a labeled antibody. The protein capable of
binding the mismatch binding protein may be immobilized
on a solid support.



A detection kit may also include a mismatch binding 
protein immobilized on a solid support, and means for
detecting a heteroduplex bound to the support through 
the protein, or eluted from the support. 

The invention also features a kit for separating a
heteroduplex nucleic acid from a mixture of
heteroduplex and homoduplex nucleic acids, which
includes a mismatch binding protein, a moiety capable
of binding a mismatch binding protein, or a moiety
capable of binding a complex comprising a mismatch
binding protein and a heteroduplex, all coupled to a
solid support, and means for separating the
heteroduplex from homoduplex. Any of the kits may
include a reference nucleic acid.

In still another aspect, the invention features a 
solid support, e.g., an affinity matrix for binding 
heteroduplex nucleic acids. The support comprises a 
mismatch binding protein coupled to a high surface area 
matrix. Alternatively, the support may comprise
immobilized moieties which bind a mismatch binding 
protein, or bind a heteroduplex/mismatch binding 
protein complex. 

As used herein, a "mismatch binding protein" refers
to any organic moiety, e.g., a protein, polypeptide,
organic analog thereof, or other moiety or mixture of
moieties, which binds preferentially to regions of
double-stranded nucleic acids containing a mismatch.
The mismatched regions may be as little as one
nucleotide pair and may be as large as 5-10 nucleotide
pairs, e.g., a small loop region. Such binding
proteins include but are not limited to naturally
occurring proteins, such as MutS, MutL, MutH, and MutU
(helicase II) from E. coli and Salmonella typhimurium,
HexA and HexB from S. pneumoniae, and mismatch binding
proteins found in higher organisms, including humans
(Jiricny et al., 1988, Proc. Nat. Aca. Sci. USA
85:8860; Stephenson et al., 1989, J. Biol. Chem.
264:21177), and analogs thereof which contain amino
acid differences that do not destroy binding of the
protein to the mismatched nucleotides, but may have
properties not present in conventional mismatch binding
protein, e.g., thermostability. As used herein,
"mismatch binding protein" also includes proteins which
do not naturally bind a nucleotide mismatch, but which
have been altered or engineered to bind a nucleic acid
fragment containing mismatched nucleotides, and
muteins, derivatives, truncated analogs, or species
variants of naturally occurring mismatch binding
proteins. The definition also includes an antibody or
a mixture of antibodies that recognizes and binds
heteroduplex nucleic acids. Also included in the
invention are mismatch binding proteins that modify
nucleic acids containing mismatches, thus allowing the
nucleic acid to be subsequently recognized by other
proteins or means.

As used herein, "homoduplex" refers to double
stranded nucleic acid containing first and second
strands which are fully complementary. "Heteroduplex"
refers to double stranded nucleic acid containing first
and second strands which are substantially
complementary, but which contains regions of
noncomplementarity, i.e., one or more mismatched
nucleotide pairs. Regions of noncomplementarity may
cause small loops to form within one strand of the
heteroduplex. There may be as few as one region of
noncomplementarity per heteroduplex, or many regions, so
long as the heteroduplex can form a stable hybrid under
conditions selected to form the hybrid. A
noncomplementary region may include insertions or
deletions of one or more bases of one strand relative
to the other strand. "Competitor" nucleic acid refers
to homoduplex nucleic acid that is either unlabeled or
does not contain PCR tails, or that is otherwise
distinguishable from heteroduplex nucleic acid. "Excess
homoduplex" nucleic acid refers to a mixture containing
at least two-fold, preferably at least five- or
ten-fold, and most preferably at least 100-fold more
homoduplex nucleic acid than heteroduplex nucleic acid,
where the excess homoduplex nucleic acid is a natural
by-product of the process that created the heteroduplex
nucleic acid. "Excess competitor" nucleic acid refers
to a mixture containing at least two-fold, preferably at
least five- or ten-fold, most preferably at least
100-fold more competitor homoduplex nucleic acid than
heteroduplex nucleic acid. "Nucleic acid" refers to
DNA or RNA containing naturally occurring nucleotides
or synthetic substitutions thereof. "Test" nucleic
acid refers to single- or double-stranded DNA or RNA to
be compared to the nucleic acid standard, e.g., DNA
from a patient suspected of having a genetic disease.
"Reference nucleic acid" refers to a single- or
double-stranded nucleic acid standard, e.g., a nucleic
acid encoding a normal protein or regulatory function.
"Mismatched nucleotide pair" refers to a nucleotide
pair which does not match according to Watson/Crick
base pairing, i.e., is not G:C, A:T, or A:U. A
"nucleotide variation" is a nucleotide sequence
difference between a test nucleic acid and a reference
nucleic acid, and constitutes as little as one base
of a mismatched nucleotide pair. "Amplify" means
to make multiple copies of a nucleic acid fragment or a
mixture of nucleic acids. "PCR" means polymerase chain
reaction, and "PCR tail" refers to oligonucleotide
duplexes which are ligated to the ends of nucleic acids
and which, upon denaturation, may hybridize to
complementary primers used to prime the synthesis of
DNA. "Labeled" means containing a detectable moiety or
a moiety which participates in a reaction resulting in
detection, e.g., a chromogenic reaction. A detectable
moiety may include, but is not limited to, a radioactive
marker, e.g., 32P, and non-radioactive markers, e.g.,
biotin. "Affinity purification" or "affinity
fractionation" means to separate heteroduplex or
heteroduplex/binding protein complex from other
components based on the affinity of the heteroduplex or
complex. An "affinity matrix" is a solid support which
is used to affinity purify heteroduplex or
heteroduplex/binding protein complex.
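
The definitions of "mismatched nucleotide pair", "homoduplex" and "heteroduplex" may be illustrated by a short sketch. It is an illustration under stated assumptions, not part of the claimed methods: the two strands are taken to be of equal length and already aligned so that position i of one strand pairs with position i of the other, and loops arising from insertions or deletions are not modeled. All names are hypothetical.

```python
# Watson/Crick pairs as listed in the definition: G:C, A:T, and A:U.
WATSON_CRICK = {("G", "C"), ("C", "G"),
                ("A", "T"), ("T", "A"),
                ("A", "U"), ("U", "A")}


def mismatched_pairs(strand: str, paired_strand: str) -> list:
    """List (position, base, paired base) for each non-Watson/Crick pair.

    paired_strand must be written so that its i-th base is the one that
    pairs with the i-th base of strand (i.e., already aligned).
    """
    if len(strand) != len(paired_strand):
        raise ValueError("this sketch handles equal-length strands only")
    pairs = zip(strand.upper(), paired_strand.upper())
    return [(i, a, b) for i, (a, b) in enumerate(pairs)
            if (a, b) not in WATSON_CRICK]


def classify_duplex(strand: str, paired_strand: str) -> str:
    """Return "heteroduplex" if any pair is mismatched, else "homoduplex"."""
    if mismatched_pairs(strand, paired_strand):
        return "heteroduplex"
    return "homoduplex"


if __name__ == "__main__":
    print(classify_duplex("GATC", "CTAG"))   # every pair is Watson/Crick
    print(mismatched_pairs("GATC", "CTGG"))  # T paired with G at position 2
```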

As used herein, a nucleic acid "isolated from an
organism" refers to DNA or RNA that has been extracted
directly from cells or tissue of one or more members of
a species, e.g., procaryotic, eukaryotic, or mammalian,
especially human DNA or RNA from human cells or tissue;
or to DNA that has been cloned from genomic DNA or from
RNA sequences; or to DNA that has been amplified from
an organism's DNA using the technique of polymerase
chain reaction. Nucleic acid "native to an individual"
refers to DNA or RNA that has been extracted from, cloned
from, or amplified from cells or tissue of a member of
a species. Where a nucleic acid is "suspected to
contain" a nucleotide variation, it is not known
whether the nucleic acid contains the variation prior
to performing the method of the invention.

Other features and advantages of the invention will 
be apparent from the following description of the 
preferred embodiments, from the drawing, and from the 
claims.

Detailed Description of the Invention

We first briefly describe the drawings.
Drawings 

Figure 1 schematically illustrates a method of
detecting nucleic acid sequence mismatches;

Figure 2 schematically illustrates a method for
performing genetic disease diagnosis using a method of
the invention in which the reference nucleic acid is
labeled or detected using other means;

Figure 3 schematically illustrates a method of
affinity purifying heteroduplex nucleic acid molecules
using a mismatch binding protein;

Figure 4 schematically illustrates heteroduplex
affinity purification in which heteroduplex/mismatch
binding protein complexes are fractionated;

Figure 5 schematically illustrates a method of
detecting nucleic acid sequence mismatches using an
array of plural, separate reference nucleic acids
arranged on a solid support;

Figure 6 schematically illustrates a method of
detecting nucleic acid sequence mismatches using a band
shift assay;

Figure 7 illustrates the results of a band shift
assay;

Figure 8 schematically illustrates a method of
differentially cloning nucleic acid sequences
containing sequence variations;

Figure 9 is a polyacrylamide gel showing the
results of purification of histidine-tagged MutS; and

Figure 10 schematically illustrates a method of
differentially analysing test/reference nucleic acid
hybrids containing a mismatch.

We next describe preferred embodiments of the 
invention.

I. Preparation of Nucleic Acids

Test or reference nucleic acids can be prepared 
using a variety of techniques. For example, nucleic 
acid can be extracted from cells and used directly, or 
a specific region of extracted nucleic acid may be 
amplified; alternatively, nucleic acid may be 
synthesized. 



Cultured cells, tissue, or blood samples may be used
as the source of a nucleic acid sequence. Cultured
monoclonal cell lines will give a single type of test
nucleic acid, and cultured polyclonal cell lines can be
used to check for differences between one standard
nucleic acid and a library of nucleic acids containing
many different test DNAs. Chromosomal and/or
extra-chromosomal DNA, such as plasmid DNA, can be
isolated for use as test or reference nucleic acid.

Nucleic acid can be extracted from cells,
purified, and digested with restriction enzyme(s) to
create nucleic acid fragments, and also may be
subsequently amplified. The polymerase chain reaction
(PCR) can be used to amplify a given region of nucleic
acid in order to limit the scope of inquiry to this
region, by choosing appropriate primers that flank the
region of interest. In addition, multiple primers can
be used at once to amplify a set of regions of interest
for simultaneous comparison.

Test or reference nucleic acid may also be prepared
from synthetic DNA. DNA can be synthesized, and one or
more oligonucleotides may be used as a test or
reference nucleic acid. Oligonucleotides are
particularly useful as reference nucleic acid for
moderate size regions.

A test or reference nucleic acid may also include a
mixture of two or more of cellular DNA, amplified DNA,
and/or synthetic DNA, for simultaneous comparison of
different nucleic acid loci.

1. Representational Difference Analysis.

If desired, a nucleic acid sample may be treated so
as to reduce the complexity of the sample by removing
irrelevant or unnecessary nucleic acid sequences, e.g.,
using representational difference analysis,
subtractive hybridization or kinetic enrichment
(Kinzler et al., Nucleic Acids Research 17(10):3645
(1989); Lisitsyn et al., Science 259:956 (1993), both
of which are hereby incorporated by reference). The
complexity of a nucleic acid sample may be decreased
significantly by preparing a representative portion of
each of the test and reference nucleic acid samples, or
of the denatured and reannealed test/reference sample,
as described by Lisitsyn et al., supra. Nucleic acid
populations of reduced complexity, i.e.,
"representations", allow for detection of nucleotide
sequence differences between two complex genomes. One
method of creating a representative portion of a
nucleic acid sample is to selectively amplify certain
fragments relative to others. For example, test or
reference nucleic acid is first cleaved into
restriction fragments, and then PCR tails are ligated
onto the ends of the fragments. If the restriction
sites chosen for cleavage occur infrequently, then the
average restriction fragment size will be large. Upon
amplification of the tailed fragments using PCR primers
that are complementary to the tail sequences, the
smaller fragments of the mixture will be selectively
amplified. Thus, a representative nucleic acid sample
is created which contains the relevant sequences but is
significantly less complex than the original nucleic
acid sample. Subsequent reiterations of the method will
further enrich the sample for relevant sequences.
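
The size-selective step described above can be sketched in silico. The
following Python sketch is illustrative only: the toy sequence, the
choice of the AGATCT (BglII) recognition site, the size cutoff, and the
function names are assumptions made for the example and are not taken
from this disclosure. It cuts a sequence at each occurrence of a
recognition site and keeps only the fragments short enough to be
amplified under the assumed cutoff.

```python
def digest(sequence, site, cut_offset):
    """Cut `sequence` at every occurrence of `site`.

    `cut_offset` is the number of bases of the site that stay on the
    left-hand fragment (1 for A^GATCT). Hypothetical helper.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return [f for f in fragments if f]


def representation(fragments, max_length):
    """Keep fragments no longer than `max_length` (assumed cutoff)."""
    return [f for f in fragments if len(f) <= max_length]


# Toy sequence with two recognition sites; the cutoff of 12 is arbitrary.
genome = "TTTT" + "AGATCT" + "CCCCCC" + "AGATCT" + "G" * 20
frags = digest(genome, "AGATCT", 1)
subset = representation(frags, 12)
print([len(f) for f in frags], subset)
```

In this toy case the longest fragment is excluded from the
representation, which mirrors the preferential amplification of smaller
fragments described in the text.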

Test or reference nucleic acids also may have
identical primer sequences incorporated at their ends
to permit the later amplification of the heteroduplex
nucleic acid; for example, PCR tails may be added onto
the ends of, e.g., the "A" and "B" samples in Fig. 1,
prior to step 1, and PCR amplification may be performed
at a later step in the procedure.

2. Differential PCR Tailing. 

PCR also can be used so as to allow subsequent
amplification of only test-reference hybrids, and thus
reduce the frequency of test-test and/or
reference-reference hybrids in the sample. Fig. 10
schematically illustrates this method. In this method,
a first PCR tail is ligated onto the 5' end of test
nucleic acid ("A" nucleic acid in Fig. 10) and a second
PCR tail is ligated onto the 3' end of reference
nucleic acid ("B" nucleic acid in Fig. 10). It will be
appreciated that complete or partial digestion by
multiple restriction enzymes yields non-symmetric 5'
and 3' ends suitable for differential PCR tail
ligation. Of course, the first PCR tail may be ligated
onto reference nucleic acid and the second PCR tail may
be ligated onto test nucleic acid. According to this
method of the invention, only test-reference hybrids
will undergo exponential amplification. This method is
described in detail below.
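
The practical difference between exponential and non-exponential
amplification can be illustrated with a toy calculation. The sketch
below is an idealized model and not the procedure of the invention: it
assumes that a template carrying both primer sites doubles in every
cycle, while a template carrying only one primer site is copied
linearly (one new strand per original template per cycle). The function
name and the cycle number are hypothetical.

```python
def copies_after(cycles, start_copies, primer_sites):
    """Idealized PCR yield for a template with 0, 1 or 2 primer sites."""
    if primer_sites == 2:
        return start_copies * 2 ** cycles   # exponential amplification
    if primer_sites == 1:
        return start_copies * (1 + cycles)  # linear amplification
    return start_copies                     # no amplification


# Assumed example: 20 cycles, one starting molecule of each kind.
hybrid = copies_after(20, 1, 2)       # template with both tails
same_source = copies_after(20, 1, 1)  # template with a single tail
print(hybrid, same_source)
```

Under these assumptions the doubly tailed template outnumbers the
singly tailed one by several orders of magnitude after 20 cycles.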

3. Differential Strand Labeling.

Test and reference nucleic acids may also be
differentially labeled to allow their progress to be
traced through the comparison process. For example, a
test nucleic acid can be left unlabeled and the
reference nucleic acid (or another test nucleic acid)
can be, for example, end-labeled with 32P by a kinasing
reaction. Any appropriate labeling method may be used;
e.g., to permit detection of radioactively-labeled
nucleic acid or chromogenic or chemiluminescent
detection of, for example, a biotin labeled nucleic
acid. In addition, determining the presence or absence
of specific nucleic acid sequences may be achieved by
differential detection, e.g., using different PCR
primer sequences which are sequence specific for the
fragments of interest. The subsequent selection of
corresponding primer oligonucleotides for use in the
PCR amplification reaction, followed by analysis of the
amplified nucleic acid, will give amplification of the
selected nucleic acid.

II. Preparation of Heteroduplexes and Homoduplexes 

Heteroduplex nucleic acid includes double stranded
nucleic acids in which the molecules contain one strand
each from the test and reference nucleic acids. If the
test and reference nucleic acids contain differences,
annealing of test and reference strands will create
heteroduplex molecules. Where the test and reference
nucleic acids are completely homologous or the test and
reference strands anneal as test/test or
reference/reference hybrids, a homoduplex will be
created. The heteroduplex molecule forms despite the
mismatch because the remainder of the matched base
pairs stabilizes the heteroduplex molecule. Thus,
heteroduplex molecules are formed by fragments that are
similar enough to anneal but that contain mismatches.

The degree of similarity necessary for a
heteroduplex to be formed can be controlled by the
stringency of the annealing conditions. For example,
if the annealing reaction is run at an elevated
temperature, single stranded molecules will need to
have increased sequence similarities before they can
form heteroduplexes. Conditions for annealing of
nucleic acids to form hybrids are well-known in the art
or, if unknown, can be determined by routine
experimentation. See, for example, Alt et al. (1978,
J. Biol. Chem. 253:1357, hereby incorporated by
reference).

A standard method of denaturing and reannealing
nucleic acids which may be used to prepare
heteroduplexes according to the invention is the
following. The test nucleic acid is suspended in 100
µl of 1x SSC buffer (0.15 M NaCl, 0.015 M Na citrate) in
an Eppendorf tube. The tube is placed in a beaker of
water, and the beaker of water is placed in a boiling
water bath until the water in the beaker boils. After
ten minutes of boiling, the beaker is removed from the
water bath, allowed to cool to 65°C, and placed in
a 65°C water bath. The 65°C water bath is switched
off. The nucleic acid is allowed to anneal during
cooling of the 65°C water bath to room temperature.
The nucleic acid can then be ethanol precipitated and
resuspended in TE buffer.
III. Identification of Heteroduplex Fragments

Figs. 1-6 and 8 schematically illustrate methods
for the detection and/or analysis of genetic
differences according to the invention. Fig. 7 shows
the results of one such identification.



In Fig. 1, a method of detecting a nucleotide pair
mismatch is shown schematically. In step 1, test and
reference nucleic acids (samples A and B, respectively,
each sample containing two different nucleic acid
fragments, 1 and 2, respectively), are denatured and
reannealed such that single stranded molecules from
sample A nucleic acid and sample B nucleic acid
reanneal to form duplexes. Fragment 2 in each of the
test and reference samples is identical (i.e., contains
no mismatches), and forms a homoduplex after the
reannealing process. In contrast, fragment 1A differs
from fragment 1B by only a single base pair mismatch.
When a single strand of fragment 1A reanneals with a
single strand of fragment 1B, a heteroduplex nucleic
acid molecule forms ("1A/1B" in the figure) containing
a mismatched base pair. This is shown schematically in
Fig. 1 as the mixture of denatured and reannealed
fragments between steps 1 and 2. Fragments 1A/1B and
1B/1A each contain a nucleotide pair mismatch, whereas
fragments labeled "1A/1A", "1B/1B", and "2" are fully
complementary. The mixture of fragments is then
subjected to a binding reaction in which the mismatch
binding protein is allowed to bind to fragments
containing mismatches. The results of the binding
reaction are shown schematically in step 2 of Fig. 1,
in which the protein is shown bound to each of
fragments "1A/1B" and "1B/1A" containing mismatches.
In step 3, the mismatches are detected and/or
quantitated. Examples of detection and quantitation of
nucleotide pair mismatches are disclosed herein.
Optional steps in the method shown in Fig. 1 and in
other figures include the addition of competitor
nucleic acid prior to binding of the mismatch binding
protein to reduce nonspecific binding to matched
nucleic acid, and thus reduce background; and the
amplification of a sample containing heteroduplex
nucleic acid at some step prior to detection or
quantitation. These optional steps are discussed more
fully below.
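
The duplex species expected for fragment 1 in the annealed mixture of
Fig. 1 can be enumerated with a short sketch. It is a simplification
that assumes any top strand may pair with any bottom strand of the same
fragment; the sequences and names below are invented for illustration
and do not come from this disclosure.

```python
from itertools import product

# Hypothetical top-strand sequences for fragment 1 in samples A and B;
# they differ at a single position.
top_strands = {"1A": "GATTACA", "1B": "GATCACA"}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def mismatches(top, bottom_source_top):
    """Count mismatched positions when `top` pairs with the strand
    complementary to `bottom_source_top` (aligned position by position)."""
    bottom = bottom_source_top.translate(COMPLEMENT)
    return sum(1 for t, b in zip(top, bottom)
               if t.translate(COMPLEMENT) != b)


duplexes = {}
for top_name, bottom_name in product(top_strands, repeat=2):
    label = f"{top_name}/{bottom_name}"
    duplexes[label] = mismatches(top_strands[top_name],
                                 top_strands[bottom_name])

heteroduplexes = sorted(k for k, n in duplexes.items() if n > 0)
print(duplexes, heteroduplexes)
```

Only the two mixed pairings carry a mismatch, which corresponds to the
"1A/1B" and "1B/1A" species to which the mismatch binding protein is
shown bound in step 2.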

In Fig. 2, a quantitative method of genetic disease
diagnosis according to the invention is schematically
shown. Patient nucleic acid is prepared according to
conventional techniques and cleaved into restriction
fragments. The nucleic acid standard, to which the
patient nucleic acid is to be compared, contains
"normal" nucleic acid fragments, i.e., nucleic acid
fragments having a sequence known to reflect the normal
gene functions. In this example, either the nucleic
acid standard is labeled or the mismatch binding
protein is labeled. The two nucleic acid samples are
then subjected to any one of the methods of the
invention, including those illustrated in the figures.
This step is referred to as "Nucleic Acid Comparison"
in Fig. 2. The results of the nucleic acid comparison,
i.e., the detection or isolation of hybrid nucleic acid
fragments of patient/standard nucleic acid containing
one or more nucleotide pair mismatches, may be
subjected to quantitative analysis by quantitating the
data present in both input and output samples.
In Fig. 3, a method of selectively enriching for
nucleic acid hybrids containing mismatches is shown.
In this figure, the affinity purification step involves
selectively sequestering heteroduplex nucleic acid
using a mismatch binding protein. Step 1 of Fig. 3 is
similar to step 1 of Fig. 1, and involves the
denaturation and annealing of a test and a reference
nucleic acid sample (A and B, respectively). The
mixture of annealed nucleic acid is shown, as in Fig.
1. The annealed mixture is then subjected to an
affinity purification reaction in which heteroduplex
nucleic acid is bound by a mismatch binding protein
under appropriate binding conditions, as described
herein. The affinity purification reaction may be an
immunoprecipitation reaction in which the mismatch
binding protein is allowed to bind to the nucleic acid,
followed by immunoprecipitation using an antibody, as
described below. Alternatively, the affinity
purification reaction may include subjecting the
annealed mixture to mismatch binding protein coupled to
beads, e.g., in a free slurry or poured into a column
matrix. The bound heteroduplex nucleic acid will
become sequestered with the beads and will thus be
separable from the unbound nucleic acid. After
separation, the bound nucleic acid is eluted or
released (step 3). The mismatch binding protein may be
attached to any solid support that will permit the
separation of free nucleic acid from nucleic acid bound
by the mismatch binding protein.

Affinity purification of heteroduplex nucleic acid
may involve any of a number of affinity purification
techniques, and is not limited to that discussed above.
For example, as shown in Fig. 4, the affinity step may
involve selective sequestering of the entire
heteroduplex/mismatch binding protein complex, rather
than just the heteroduplex nucleic acid itself. Steps
1 and 2 of Fig. 4 are similar to steps 1 and 2 of Fig.
1, in which the annealed mixture is formed and
subjected to a binding reaction in which mismatch
binding protein binds to heteroduplex nucleic acid in
the mixture. In step 3, the heteroduplex/binding
protein complexes are selectively retained, e.g., by a
matrix to which an antibody specific for the binding
protein is coupled. The complexes may then be eluted
(step 4), followed by isolation of the heteroduplex
nucleic acid (step 5), e.g., by phenol extraction of
protein and ethanol precipitation of nucleic acid.

Fig. 5 shows an alternative method of genetic
disease screening and diagnosis in which nucleotide
pair mismatches are detected in a simple assay. This
method is a specific embodiment of that shown in Fig.
1, and involves a solid support in which quantities of
reference nucleic acid are spotted onto a membrane in
an ordered pattern. The standard (reference) and the
patient (test) nucleic acids are then denatured and
annealed according to conventional techniques. After
the hybrids are allowed to form, the membrane is
subjected to a binding reaction in which mismatch
binding protein is allowed to bind to any
heteroduplexes which may have formed. After unbound
mismatch binding protein is washed off the membrane,
the presence of bound mismatch binding protein is
detected using any appropriate detection technique
disclosed herein or known in the art.
An alternative to fixing the reference nucleic acid
on a solid support is to fix the test nucleic acid on a
solid support. The technique outlined in Fig. 5 can be
applied to this alternative method, with the
modification that reference nucleic acid is annealed to
the fixed test nucleic acid. Methods of fixing test
nucleic acid to a solid support include crosslinking,
alkaline transfer to a membrane, or other techniques,
as described in Ausubel et al., eds., 1992, Current
Protocols in Molecular Biology, John Wiley & Sons, NY,
also herein incorporated by reference. Alternatively,
in situ hybridization, also as described in Ausubel,
can be used to directly anneal reference nucleic acid
to test nucleic acid that is contained in sectioned
cells. Annealing can be optionally performed in the
presence of competitor nucleic acid.

Another alternative method of genetic disease
screening or diagnosis involves the detection of
nucleotide pair mismatches using a band shift assay.
Fig. 6 illustrates this method. In steps 1 and 2, the
patient (test) nucleic acid is denatured and annealed
to reference nucleic acid and allowed to bind to
mismatch binding protein, as described for Fig. 1. The
bound nucleic acid is then electrophoresed on an
agarose gel. This method takes advantage of the
decreased mobility of bound heteroduplexes relative to
unbound hybrids in agarose. As shown schematically in
Fig. 6, the control lane (left), in which the annealed
fragments were not subjected to mismatch binding
protein, contains only homoduplex fragment 2 (top) and
1A/1A, 1B/1B, or unbound heteroduplex 1A/1B or 1B/1A
(bottom), whereas the experimental lane (right)
contains both homoduplex bands (top and bottom) and the
middle heteroduplex band (1A/1B or 1B/1A). The results
of such an assay are shown in Fig. 7. Mismatch binding
protein was allowed to bind under binding conditions to
a mixture of nucleic acid fragments, and the mixture
was then subjected to agarose gel electrophoresis. The
nucleic acid fragment in the mixture that contained a
nucleotide pair mismatch is near the top of the gel
(lane 2) and thus was selectively slowed relative to
the faster running unbound nucleic acid fragments,
which migrated to the bottom of the gel. The control
lanes in Fig. 7 (lanes 1 and 3) show that when no
mismatch binding protein is added to the binding
reaction, there is no binding to fragments and
consequently no fragments migrating with the bound
fragments in the gel.

A genetic disease may be not only detected, but
also further analyzed to learn more about the genetic
cause of the disease using the mismatch detection and
isolation methods of the invention. Such analysis may
include determining the nucleotide sequence of the
strands of the isolated heteroduplex nucleic acid, or
may involve the cloning of that portion of the
patient's nucleic acid that contains the nucleotide
sequence difference. Fig. 8 schematically illustrates
a method of differential cloning of heteroduplex
strands. Test nucleic acid includes heteroduplex
nucleic acid from samples A and B as shown in Figs. 3
or 4. This nucleic acid was prepared by annealing a
patient and a standard nucleic acid and purifying the
heteroduplexes bound by the mismatch binding protein to
produce mixture 1 in the figure. Reference nucleic acid
in Fig. 8 is prepared from mixtures 2 and 3. Mixture 2
is prepared by denaturing and annealing sample A with
itself and purifying heteroduplexes bound by mismatch
binding protein. Similarly, mixture 3 is prepared by
denaturing and annealing sample B with itself and
purifying heteroduplexes bound by mismatch binding
protein. Mixtures 2 and 3 are then pooled without
denaturing and reannealing again to produce the
reference nucleic acid. The test A/B and reference A/A
and B/B nucleic acids are then subjected to the
differential cloning method described below. This
method produces clones of A and B nucleic acids that
were part of an A/B heteroduplex.

IV. MutS Binding Reaction 

The mismatch binding protein MutS from Salmonella
typhimurium selectively binds mismatches in
heteroduplex molecules. MutS also binds mismatches
that include deleted or added bases. Additional
mismatch binding factors, such as MutL, can also be
used in the binding reaction as an alternative to or in
combination with MutS, to increase binding. MutS
protein can be purified using the MutS overproducer
plasmid pGW1825 (Haber et al., 1988, J. Bacteriol.
170:197) and the method of Su and Modrich (1986, Proc.
Nat. Aca. Sci. 83:5057). MutL has been cloned into
plasmid pGW1842 (Mankovich et al., 1989, J. Bacteriol.
171:5325), and can be purified using the method of
Grilley et al. (1989, J. Biol. Chem. 264:1000). Haber
et al., 1988, Su et al., 1986, Grilley et al., 1989, and
Mankovich et al., 1989 are all hereby incorporated by
reference.

The mismatch binding protein/heteroduplex binding
reaction is typically performed as follows. The
reaction is performed in assay buffer (20 mM Tris-HCl
pH 7.6, 5 mM MgCl2, 0.1 mM DTT, and 0.01 mM EDTA) for
30 minutes on ice. Typical binding reactions are 10 µl
total volume, with 0.2 pmol of duplex DNA and 40 pmol
of mismatch binding protein, e.g., MutS. The addition
of ATP to the binding reaction may increase the
efficiency of binding of the protein or of cofactors
such as MutL.
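
The amounts quoted for a typical binding reaction translate into
working concentrations by simple arithmetic. The sketch below uses only
the values stated in the text (10 µl, 0.2 pmol duplex DNA, 40 pmol
protein); the helper name is hypothetical.

```python
def nanomolar(pmol, volume_ul):
    """Convert an amount (pmol) in a volume (microliters) to nM.

    1 pmol per microliter equals 1 micromolar, i.e. 1000 nM.
    """
    return pmol / volume_ul * 1000.0


dna_nM = nanomolar(0.2, 10)     # duplex DNA
protein_nM = nanomolar(40, 10)  # mismatch binding protein, e.g. MutS
molar_excess = 40 / 0.2         # protein molecules per duplex
print(dna_nM, protein_nM, molar_excess)
```

With these inputs the duplex DNA is at roughly 20 nM and the protein at
roughly 4 µM, a molar excess of protein over duplex of about 200-fold.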

In addition to selectively binding heteroduplex
nucleic acid, MutS nonspecifically binds to homoduplex
nucleic acid to some degree. In order to reduce
nonspecific binding, competitor (i.e., homoduplex)
nucleic acid may be added to the heteroduplex mixture
prior to the binding reaction or the affinity
fractionation step, as shown in Figs. 1, 3, and 4.
Where the test or reference nucleic acid is labeled, as
shown in Fig. 2, the use of excess unlabeled competitor
DNA will cause most non-specific binding to occur on
unlabeled nucleic acid, as is more fully described
below. Thus, the effect of non-specific interactions
will be minimized if the label is used to follow the
progress of the fractionation. Competitor nucleic acid
is also useful in the amplification process. Starting
nucleic acid can be prepared with PCR tails to permit
amplification, as shown in Fig. 1, step 2. If
competitor nucleic acid lacking these PCR tails is
added to the mixture prior to amplification, the effect
of non-specific interactions will be minimized on PCR
amplified heteroduplex nucleic acid because competitor
nucleic acid that appears in the heteroduplex mixture
will not be amplified.
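
The effect of excess unlabeled competitor on a labeled readout can be
estimated with a simple proportional model. This is an assumption made
purely for illustration: nonspecific binding is taken to distribute
over all homoduplex nucleic acid in proportion to its amount,
regardless of label. The quantities and the function name are
hypothetical.

```python
def labeled_share_of_nonspecific(labeled_homoduplex, unlabeled_competitor):
    """Fraction of nonspecific binding that falls on labeled nucleic acid,
    assuming binding is proportional to the amount of each species."""
    total = labeled_homoduplex + unlabeled_competitor
    return labeled_homoduplex / total


# Assumed amounts in arbitrary units: no competitor vs. 99-fold excess.
without_competitor = labeled_share_of_nonspecific(1.0, 0.0)
with_competitor = labeled_share_of_nonspecific(1.0, 99.0)
print(without_competitor, with_competitor)
```

Under this model a 99-fold excess of competitor leaves about 1% of the
nonspecific binding on labeled nucleic acid, which is the sense in
which the label then mainly tracks specific binding.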

V. Detection of Nucleotide Pair Mismatches 



The detection of heteroduplex nucleic acid
according to the invention is accomplished using a
binding assay in which one or more mismatch binding
protein(s) bind to a nucleotide mismatch to form a
nucleic acid/protein complex which is subsequently
detected.



For diagnosis of a genetic disease where the
mutation that causes the disease is known, the
invention provides methods which enable detection of
the presence of heteroduplexes between patient and
reference nucleic acids. The invention utilizes known
methods of nucleic acid hybridization to form duplexes
of test and reference strands, and provides inventive
methods for the sensitive detection of even a single
base pair mismatch in a heteroduplex. Thus, a genetic
disease, one example of which is sickle-cell anemia,
which involves the substitution of a thymine for an
adenine at position 17 of the gene sequence encoding
the beta chain of hemoglobin, is easily diagnosed by
the mismatch detection methods of the invention, as
described below. Other diseases involving genetic

mutations which are diagnosable according to the
invention include the following. For example, Tajima
et al. (Jour. Biochem. 105:249, 1989) disclose a GAG
-> AAG base change which leads to a Glu -> Lys amino
acid substitution and results in apolipoprotein E
(ApoE) deficiency; Hirshhorn et al. (Jour. Clin.
Invest. 83:487, 1989) describe a mutation which leads
to adenosine deaminase (ADA) deficiency, i.e., a single
base change (CCG -> CAG) leading to a Pro -> Gln amino
acid substitution; Jagadees et al. (Jour. Cell. Biol.
Suppl. 13E:291, 1989) describe mutations at seven
different locations within the FX gene, GAT -> AAT
resulting in an Asp -> Asn substitution at position 58,
GTG -> ATG resulting in a Val -> Met substitution at
position 68, GCC -> ACC resulting in a Glu -> Lys
substitution at position 156, TCC -> TTC resulting in a
Ser -> Phe substitution at position 188, GCC -> ACC
resulting in an Ala -> Thr substitution at position
335, and GGG -> AGG resulting in a Gly -> Arg
substitution at position 447, each mutation of which
results in a Factor X deficiency; Ginsburg et al.
(Proc. Nat. Aca. Sci. 86:3723, 1989) describe two
mutations, GTC -> GAC and CGG -> TGG resulting in Val
-> Asp and Arg -> Trp substitutions at positions 844
and 834, respectively, each of which produces a
defective von Willebrand Factor 2a; Matsuura et al.
(Jour. Biol. Chem. 264:10148, 1989) describe a mutation
which leads to adenylate kinase deficiency (CGG -> TGG)
leading to an Arg -> Trp amino acid substitution;
Dilella et al. (Nature 327:333, 1987) describe a

mutation within the PAH gene, CGG -> TGG resulting in
an Arg -> Trp substitution at position 408, which
produces the condition known as phenylketonuria; Bock
et al. (Biochem. 27:6171, 1988) disclose a CCT -> CTT
single base change which leads to a Pro -> Leu amino
acid substitution and results in antithrombin III
deficiency; Ohno et al. (Jour. Neurochem. 50:316, 1988)
report on a CGC -> CAC mutation resulting in an Arg ->
His substitution at codon 178 of the HezB gene which
produces Tay-Sachs disease; Gibbs et al. (Proc. Nat.
Aca. Sci. 86:1919, 1989) disclose mutations at seven
different codons of the HPRT gene, TCT -> TTA resulting
in a Phe -> Leu substitution at position 73, TTG -> TCG
resulting in a Leu -> Ser substitution at position 130,
GCA -> TCA resulting in an Ala -> Ser substitution at
position 160, CGA -> TCA resulting in premature
termination of translation at position 169, TTC -> GTC
resulting in a Phe -> Val substitution at position 198,
CAT -> GAT resulting in a His -> Asp substitution at
position 203, and TGT -> TAT resulting in a Cys -> Tyr
substitution at position 205, each mutation of which
results in HPRT deficiency; and Vulliamy et al. (Proc.
Nat. Aca. Sci. 85:5171, 1988) disclose mutations at
seven different positions within the G6PDH gene, GAT ->
AAT resulting in an Asp -> Asn substitution at position
58, GTG -> ATG resulting in a Val -> Met substitution
at position 68, AAT -> GAT resulting in an Asn -> Asp
substitution at position 126, GAG -> AAG resulting in a
Glu -> Lys substitution at position 156, TCC -> TTC
resulting in a Ser -> Phe substitution at position 188,
GCC -> ACC resulting in an Ala -> Thr substitution at
position 335, and GGG -> AGG resulting in a Gly -> Arg
substitution at position 447, each mutation of which
produces a condition known as G6PDH deficiency.
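
A codon change and the amino acid substitution attributed to it can be
cross-checked against the standard genetic code. The sketch below
builds the standard code table and applies it to a few of the
single-base changes cited above; the function names and the small
three-letter lookup are hypothetical conveniences for the example.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Three-letter names for the residues used in this example only.
THREE_LETTER = {"E": "Glu", "K": "Lys", "P": "Pro", "Q": "Gln",
                "R": "Arg", "W": "Trp", "L": "Leu", "H": "His"}


def substitution(wild_type, mutant):
    """Return the (wild-type, mutant) residues encoded by two codons."""
    return (THREE_LETTER[CODON_TABLE[wild_type]],
            THREE_LETTER[CODON_TABLE[mutant]])


def is_single_base_change(wild_type, mutant):
    """True when the two codons differ at exactly one position."""
    return sum(a != b for a, b in zip(wild_type, mutant)) == 1


examples = [("GAG", "AAG"), ("CCG", "CAG"), ("CGG", "TGG"),
            ("CCT", "CTT"), ("CGC", "CAC")]
results = {f"{wt}->{mut}": substitution(wt, mut) for wt, mut in examples}
print(results)
```

Each of these changes creates a single mismatched base pair when the
mutant strand anneals to the wild-type strand, which is the kind of
heteroduplex the mismatch binding protein is intended to recognize.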

A spot detection assay may be used to detect 
mismatches, as shown in Fig. 5 and described above. 
This method allows for the detection of genetic 
differences between a nucleic acid standard (a 
reference nucleic acid) and a number of test nucleic 
acids. Any number of conventional detection methods 
well-known to those skilled in the art may be used;
e.g., direct detection of labeled mismatch
binding protein, detection of a fluorescent antibody
capable of binding the mismatch binding protein, or 
detection of an antibody conjugated to an enzyme that 
reacts with a chromogenic substrate. 

Also included in the invention are detection 
methods based on the use of modified nucleic acid and 
proteins capable of binding the modified nucleic acid. 
For example, a modified base may occur as part of a 
mismatched nucleotide pair, and a mismatch binding
protein capable of binding to the mismatched pair 
containing the modified base may be used for detection. 

A band shift assay may also be used to detect bound
heteroduplex nucleic acid according to the invention,
as described above for Figs. 6 and 7.

Other detection methods useful in the invention are
illustrated by way of Fig. 1. Heteroduplexes are
formed in step 1 and allowed to bind to mismatch
binding protein in step 2. The heteroduplex/mismatch
binding protein complexes may then be separated from
free nucleic acid by immunoprecipitating the complexes
with an antibody specific for the mismatch binding
protein in step 3, e.g., using the method of McKay
(supra). MutS polyclonal antibodies can be prepared
by conventional antibody preparation procedures, for
example as follows.

Purified MutS is electrophoresed on an 8%
polyacrylamide gel. After soaking in water for 10 min
to remove the SDS, the gel is stained for 10 min in 0.1%
Coomassie blue in water, and then destained in water.
The MutS band is cut out and chopped into fine pieces
with a razor blade. 1 ml of PBS (137 mM NaCl, 2.7 mM
KCl, 4.3 mM Na2HPO4·7H2O, 1.4 mM KH2PO4, pH 7.3) is
added, and the mixture is ground up further by passage
through progressively smaller syringes. Rabbits are
injected with 500 µg of a mixture of fractions
containing the MutS protein. Protein for boosts is
prepared in the same way, except that Freund's
incomplete adjuvant is used. The rabbits are boosted
twice with 100 µg of the MutS fractions, and bled to
obtain serum.

The serum is pre-absorbed and used in
immunoblotting according to the protocols of Harlow and
Lane (1988, "Antibodies, A Laboratory Manual," Cold
Spring Harbor Press, CSH, NY), hereby incorporated by
reference.

After the immunoprecipitation step, heteroduplex
nucleic acid fragments may be optionally isolated for
further analysis by performing a phenol extraction to
remove the binding protein and anti-binding protein
antibody.

Alternatively, other means of detecting bound
mismatch binding protein may be used; e.g., the
mismatch binding protein itself may be labeled or one
strand of the heteroduplex nucleic acid may be labeled
and followed into bound nucleic acid, also as described
herein. Additional detection techniques are described
below as procedures for fractionation; e.g., a mismatch
binding protein binding column which binds to mismatch
binding protein by virtue of a sequence in the binding
protein which is recognized by a moiety on the column.

VI. Affinity Fractionation of Heteroduplexes

The invention also provides for selective
enrichment of heteroduplexes within a sample by
affinity fractionation of fragments containing
mismatches, thereby achieving more sensitive detection
of the mismatch(es).



The proportion of heteroduplexes in a sample may be
substantially increased using affinity fractionation,
as shown schematically in Fig. 3. The mixture
containing heteroduplexes is subjected to affinity
purification, in which the heteroduplexes are bound to
and subsequently eluted from a solid support to which
mismatch binding protein is coupled. In Fig. 4, the
heteroduplex/mismatch binding protein complexes are
selectively retained by a matrix to which any moiety is
coupled which can bind the complex, e.g., a binding
protein-specific or complex-specific antibody.



In addition to antibody supports in which the
antibody binds directly to the mismatch binding protein
or the nucleic acid/mismatch binding protein complex,
other affinity supports may be used. For example, one
can take advantage of the ability of a metal, e.g.,
nickel, column to bind to histidine residues in a
polypeptide using immobilized metal affinity
chromatography. A histidine tail, e.g., six histidine
residues, may be covalently linked to the amino
terminus of the mismatch binding protein, as described
by Hochuli et al. (Nov. 1988, Biotechnology, p. 1321,
hereby incorporated by reference). When the
heteroduplex/binding protein complex is applied to a
nickel column, the histidine portion of the binding
protein will be bound by the column. This procedure is
also described in Hochuli et al. (ibid.).

A histidine-tagged MutS protein may be prepared
according to the following procedure. This procedure
describes the preparation of a His-MutS protein in
which six histidine residues have been added to the
amino terminus of the MutS protein. Of course, other
His-MutS proteins may be prepared; for example, any
desired number of histidine residues may be added to
the amino terminus of the MutS protein, provided the
resultant His-tagged MutS protein retains its
biological activity in binding mismatched nucleic acid
and is retainable on a nickel column. If desired, the
His-MutS protein can be purified further using a 20 mM
to 120 mM phosphate gradient on a hydroxyapatite column
or by other protein purification methods known in the
art.

Briefly, six histidine residues may be added to the
amino terminus of the MutS protein. The MutS gene may
be PCR amplified from plasmid DNA containing the gene
using PCR primers which anneal to each end of the gene
and prime DNA replication. The amplified DNA is then
digested with restriction endonucleases to generate a
restriction fragment containing MutS-encoding DNA. The
MutS-encoding restriction fragment is then cloned into
a polylinker site of a plasmid which allows for
expression of the inserted DNA by placing the inserted
DNA under control of a promoter. Preferably, this
promoter is controllable so that MutS gene expression
is initiated at a desired point in the cell cycle;
e.g., the inducible E. coli lac promoter is useful in
an E. coli host. The MutS-encoding clone is then
transformed into an appropriate host strain, and a
clone is isolated containing MutS-encoding DNA.

The MutS-encoding clone is grown under conditions
which do not allow for expression of the MutS gene
until a desired optical density of the cell culture is
reached. The culture is then induced to produce
His-MutS, and the cells are grown until they are
harvested. The cells are then centrifuged, and the
pellets are frozen at -80°C until ready for use. MutS
protein is then purified from the cell pellet as
follows. The cell pellet is thawed on ice and
resuspended in lysis buffer (20 mM KPO4 pH 7.4, 10 mM
beta-mercaptoethanol, 0.5 M KCl, 1 mM PMSF, 200 ug/ml
lysozyme). The cells are then disrupted by sonication
in an ice water bath. Cell debris is then eliminated by
centrifugation at 30,000 rpm for 30 min. The
supernatant is filtered through a 0.45 micron filter
and applied to a Qiagen (Chatsworth, CA) nickel column
at a rate of approximately 0.5 ml/min. The column is
pre-equilibrated with Buffer D (20 mM KPO4 pH 7.4,
10 mM beta-mercaptoethanol, 0.5 M KCl, 1 mM PMSF). The
column is then washed with 75 ml of Buffer D, followed
by another 10 ml of Buffer D containing 10 mM
imidazole. The protein is eluted with 80 mM imidazole
in Buffer D. The recovered protein is then dialyzed
against dialysis buffer (20 mM KPO4 pH 7.4, 10 mM
beta-mercaptoethanol, 0.5 M KCl, 0.1 mM EDTA). The MutS
protein containing an amino terminal histidine tail is
then ready for use.

Another example of an affinity support is an
antibody-bound support in which the antibody recognizes
and binds to a flag sequence, i.e., any amino acid
sequence (e.g., 10 residues) which the antibody
specifically binds to. The flag sequence may be
engineered onto the amino terminus of the mismatch
binding protein. When the heteroduplex/binding protein
complex is applied to the antibody column, the antibody
will bind to the flag sequence in the binding protein
and thus retain the complex. One embodiment of this
technique, known as The Flag Biosystem, is commercially
available from International Biotechnologies, Inc. (New
Haven, CT). Larger flag sequences may also be used;
e.g., the maltose binding protein, as described by
Ausubel et al., 1992, supra.

Alternatively, or in addition to the first
fractionation step, the eluted heteroduplex nucleic
acid is then recycled one or more times through another
affinity binding reaction to refractionate the eluted
heteroduplexes and thus remove any remaining
non-specifically bound and subsequently eluted
homoduplex nucleic acid. The refractionated
heteroduplexes are then also subsequently eluted.

Other embodiments of affinity fractionation which
are within the scope of the invention include
amplification of annealed sample nucleic acid and the
addition of competitor nucleic acid, as shown in the
figures. For example, the sample nucleic acid may be
amplified by PCR after the first affinity binding step,
but before the refractionation step. Thus, the bound
and eluted heteroduplexes will be amplified and
repurified on the affinity support. Elution of the
repurified sample nucleic acid should yield relatively
pure heteroduplex nucleic acid. In addition, excess
competitor nucleic acid (i.e., unlabeled where the
sample nucleic acid is labeled, or lacking PCR tails
where the sample nucleic acid contains PCR tails) may
be added to the sample either prior to or after
amplification in order to reduce nonspecific mismatch
protein binding to mismatched nucleic acid.

Another fractionation method allows for removal of
test-test and/or reference-reference hybrids from a
sample prior to analysis. As described generally above
and in more detail below, this method provides for
differential PCR tailing of duplex fragment ends and
thus allows for exponential amplification of
test-reference hybrids. Thus, a selective reduction is
achieved in the frequency of test-test and
reference-reference hybrids within a nucleic acid
sample.

Briefly, as shown schematically in Fig. 10, a first
PCR tail is ligated onto the 5' end of test nucleic
acid and a second PCR tail is ligated onto the 3' end
of reference nucleic acid. This technique is useful as
an intermediate amplification step which is performed
prior to a refractionation step to limit affinity
purification to test-reference heteroduplexes. Once
sample nucleic acid is annealed, a fill-in reaction is
performed in order to fill in the single stranded
overhanging 5' ends of the test-reference hybrids (see
Lisitsyn, supra). A conventional PCR reaction is then
performed using two PCR primers that are complementary
to the 5' test PCR tail and the 3' reference PCR tail.
Because only test-reference hybrids will have the
necessary 5' and 3' tails, the test-reference hybrids
will be the only heteroduplexes to undergo exponential
amplification.
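
The selection logic of the differential tailing step can
be illustrated with a minimal sketch. It is a toy model
under stated assumptions, not the procedure itself: each
strand is represented only by its origin, the tail
labels "T1" and "T2" are hypothetical names, and a duplex
is treated as exponentially amplifiable only when both
tail types are present after fill-in.

```python
from itertools import combinations_with_replacement

# Hypothetical labels: test strands carry tail "T1",
# reference strands carry tail "T2".
TAIL = {"test": "T1", "reference": "T2"}


def amplification_mode(strand_a, strand_b):
    """Classify a duplex by the PCR tails it carries after fill-in."""
    tails = {TAIL[strand_a], TAIL[strand_b]}
    # In this toy model, exponential PCR requires both primer
    # sites, i.e. one strand of each origin in the duplex.
    return "exponential" if tails == {"T1", "T2"} else "not exponential"


def classify_all():
    """Return the amplification mode of every possible duplex type."""
    pairs = combinations_with_replacement(["test", "reference"], 2)
    return {f"{a}-{b}": amplification_mode(a, b) for a, b in pairs}


if __name__ == "__main__":
    for duplex, mode in classify_all().items():
        print(duplex, "->", mode)
```

Only the test-reference entry is classified as
exponential, which mirrors the reduction in test-test and
reference-reference hybrids described above.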

In yet another fractionation method useful
according to the invention, second-order kinetics of
self-association can be used to further enrich sample
nucleic acid for fragments that are more prevalent than
others (see Wieland et al., 1990, Proc. Natl. Acad.
Sci. 87:2720, hereby incorporated by reference). After
sample nucleic acid is enriched for fragments that
contain base pair mismatches, e.g., using MutS affinity
fractionation, as described herein, these MutS-binding
fragments can be further enriched for the relevant
sequence using kinetic-enrichment.

Kinetic-enrichment is based on the following
principle. If a population of nucleic acid fragments
containing a target subpopulation enriched X times
relative to unenriched fragments in the sample is
melted and reannealed so that only a small proportion
of double-stranded nucleic acid forms, double-stranded
target nucleic acid would be present X^2 times relative
to the other sequences present as duplex nucleic acid.
To visualize this, consider viral sequences present in
excess (ten times more) relative to single-copy
β-globin sequences. At early stages of self-reannealing,
when 5.0% of the viral sequences are reannealed, only
0.5% of the β-globin sequences will be reannealed. The
ratio of the viral sequence to the β-globin sequences
in the double-stranded DNA will then be 5% of 10 to
0.5% of 1 (i.e., 100-fold more).
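
The arithmetic of the viral/globin example can be
checked with a short sketch. The function name and
parameters are illustrative assumptions, and the model
is only the early-stage approximation used in the text,
in which the fraction of a sequence that has reannealed
is proportional to its concentration.

```python
def duplex_ratio(abundance_ratio, frac_rare_reannealed):
    """Ratio of abundant to rare sequence in the duplex fraction.

    abundance_ratio: how many times more abundant the enriched
        sequence is than the rare one (X in the text).
    frac_rare_reannealed: fraction of the rare sequence that has
        reannealed (must be small for the approximation to hold).
    """
    # Second-order kinetics, early stage: the reannealed fraction
    # scales with concentration.
    frac_abundant_reannealed = frac_rare_reannealed * abundance_ratio
    if not 0 < frac_abundant_reannealed <= 1:
        raise ValueError("reannealed fraction must lie in (0, 1]")
    duplex_abundant = frac_abundant_reannealed * abundance_ratio
    duplex_rare = frac_rare_reannealed * 1
    return duplex_abundant / duplex_rare


if __name__ == "__main__":
    # Viral sequence at 10x, 0.5% of the globin sequence reannealed:
    # the duplex ratio comes out at about X^2 = 100.
    print(round(duplex_ratio(10, 0.005), 6))
```

The result depends only on the square of the abundance
ratio, which is why a modest prior enrichment is
magnified by a single round of partial reannealing.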

The kinetic-enrichment technique is useful
according to the invention as follows. Sample nucleic
acid is prepared by combining test and reference
nucleic acids under denaturing and reannealing
conditions. The sample is then enriched for
heteroduplexes thus formed, e.g., by MutS affinity
fractionation, as described herein. The MutS-bound
heteroduplexes are then released, and the heteroduplex
sample is kinetically enriched, e.g., is again
subjected to denaturation and annealing so that only a
small proportion of the sample forms duplexes. Duplexed
nucleic acid is then selected as described herein.
Because duplex formation will occur at a much higher
rate for those fragments that were enriched in the
original sample (see Lisitsyn, supra), the technique
serves to further enrich the sample for these
fragments.

Thus, a first PCR tail sequence is ligated to the
5' ends of the test nucleic acid sample, while a second
PCR tail sequence is ligated to the 3' ends of the
reference nucleic acid sample. The samples are
combined, denatured, reannealed, and then affinity
purified by selecting for those duplexes which bind to
MutS. The thus-selected MutS-binding duplexes are then
further enriched using kinetic-enrichment as follows. A
fill-in reaction is performed so that hybrids that have
one 5' PCR tail and one 3' PCR tail no longer contain
single-stranded ends. PCR amplification is then
performed using primers complementary to the 5' and 3'
PCR tails. Only those hybrids that contain both the 5'
and 3' first and second PCR filled-in ends will undergo
exponential amplification.

The fractionation procedure allows for a reduction
in the number of homoduplexes in the mixture in the
bound fraction; consequently, in the detection or
analysis steps, there will be fewer non-specific
binding interactions between the mismatch binding
protein and homoduplex nucleic acid. The sensitivity
of detection and/or quantitation of heteroduplex
nucleic acid in a test sample may be further increased
by refractionating the eluted sample, or by
refractionating the flow-through fractions through
repeated affinity steps in which heteroduplexes present
either in the eluate or flow-through are selectively
retained on the solid support.

After each refractionation binding reaction, bound
heteroduplex nucleic acid is eluted and subsequently
applied to a fresh or regenerated support.
Alternatively, the support may contain a vast excess of
binding sites, thus making intermediate elution steps
unnecessary.
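
The effect of repeated refractionation on the
composition of the bound fraction can be sketched
numerically. Everything in the sketch is an assumption
made for illustration: the per-round retention
probabilities are invented, each round is treated as
independent, and no losses other than non-retention are
modeled.

```python
def purity_after_rounds(het, hom, keep_het, keep_hom, rounds):
    """Fraction of heteroduplex in the bound material after n rounds.

    het, hom: starting amounts of heteroduplex and homoduplex.
    keep_het, keep_hom: hypothetical per-round probabilities that
        a heteroduplex or a homoduplex is retained and eluted.
    """
    for _ in range(rounds):
        het *= keep_het  # specific retention of mismatched duplexes
        hom *= keep_hom  # non-specific carry-over of matched duplexes
    return het / (het + hom)


if __name__ == "__main__":
    # Illustrative values only: 1% heteroduplex at the start.
    for n in range(4):
        print(n, round(purity_after_rounds(1.0, 99.0, 0.5, 0.01, n), 4))
```

Under these assumed values the heteroduplex fraction
rises with each round, at the cost of a lower absolute
yield, which is the trade-off that an intermediate PCR
step is intended to offset.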

The solid support useful in the invention may be
any one of a wide variety of supports, and may include
but is not limited to, synthetic polymer supports,
e.g., polystyrene, polypropylene, substituted
polystyrene (e.g., aminated or carboxylated
polystyrene), polyacrylamides, polyamides,
polyvinylchloride, etc.; glass beads; agarose;
cellulose; or any material useful in affinity
chromatography (see Pharmacia LKB Biotechnology
Products Catalog, 1992, Piscataway, NJ, hereby
incorporated by reference). The supports may be
provided with reactive groups, e.g., carboxyl groups,
amino groups, etc., to permit direct linking of the
protein to the support. The mismatch binding protein
can either be directly crosslinked to the support, or
proteins (e.g., antibodies) capable of binding the
mismatch binding protein or the nucleic acid/binding
protein complex can be coupled to the support.

For example, if the support includes sepharose
beads and the mismatch binding protein is coupled to
the beads, the binding protein-coupled beads are packed
into a column and equilibrated, and the nucleic acid
sample is applied to the column. Under
appropriate binding conditions, the protein that is
coupled to the beads in the column retains the nucleic
acid fragments or the protein/nucleic acid complex
which it recognizes. The column is then washed of
unbound nucleic acid, and the bound nucleic acid
fragments or protein/nucleic acid complexes are eluted
according to conventional techniques known in the art,
e.g., using a solution containing salt (e.g., KCl),
detergent or imidazole, that reduces the binding
between the nucleic acid and protein on the support or
the protein/nucleic acid complex and the support (see,
e.g., Scopes, Protein Purification: Principles and
Practice, 1982, Springer-Verlag, NY, or Ausubel, 1992,
Current Protocols, supra, both of which are hereby
incorporated by reference). Conditions for binding and
elution of heteroduplex nucleic acid or
heteroduplex/binding protein complexes are typically
identical to the conditions described herein for the
mismatch binding protein/heteroduplex binding reaction.

The protein may be linked to the support by a
variety of techniques including adsorption, covalent
coupling, e.g., by activation of the support, or by the
use of a suitable coupling agent or the use of reactive
groups on the support. Such procedures are generally
known in the art and no further details are deemed
necessary for a complete understanding of the present
invention. Representative examples of suitable
coupling agents are dialdehydes, e.g., glutaraldehyde,
succinaldehyde, or malonaldehyde; unsaturated aldehydes,
e.g., acrolein, methacrolein, or crotonaldehyde;
carbodiimides; diisocyanates; dimethyladipimate; and
cyanuric chloride. The selection of a suitable
coupling agent should be apparent to those of skill in
the art from the teachings herein.

Any method that permits the purification of
protein/nucleic acid complexes away from free nucleic
acid may be used, e.g., at steps 3-5 of Fig. 4.
Methods of affinity purification of mismatch binding
protein/heteroduplex complexes include
immunoprecipitation. See Ausubel, 1992, Current
Protocols, supra, and Harlow et al., 1988, Antibodies:
A Laboratory Manual, supra. Alternatively, antibodies
to the mismatch binding protein/heteroduplex complex
can be attached to any solid support that permits the
washing away of free nucleic acid. Alternatively,
immobilized metal affinity chromatography may be used
to purify histidine-tailed mismatch binding protein
that is bound to heteroduplexes.

Additional forms of affinity purification of
mismatch binding protein/heteroduplex complexes include
the use of nitrocellulose filters that bind protein
but not free nucleic acid, or the use of a gel
electrophoresis mobility shift nucleic acid-binding
assay, both of which are described in Ausubel (1992,
supra). For example, the method of the invention shown
schematically in Fig. 4 may include a gel mobility
shift assay at step 2 of the procedure. Nucleic acid
fragments that are bound by mismatch binding protein
are identified by their mobility shift. The identified
fragments are isolated (steps 4 and 5) by excising them
from the gel, and purifying them away from the gel
material, as described in Ausubel.

VII. Utilization of Heteroduplexes 

The inventive methods disclosed herein allow for
recovery of nucleic acid fragments containing
nucleotide sequence mismatches. Described below are
some of the ways in which these recovered fragments may
be used. For example, a recovered heteroduplex sample
may be used to determine the identity and position of
the mismatch by determining the nucleotide sequence of
the mismatch region and comparing the sequence with
sequence data from reference nucleic acid. Other
examples of ways to utilize the isolated heteroduplexes
are as follows.

Heteroduplexes may be used to quantitatively
determine the fraction of heteroduplex fragments in a
mixture and the proportion of mismatch binding protein
bound to heteroduplex nucleic acid, and thus may be
used to determine the number of fragments containing
mismatches within a sample. Labeling of the input test
or reference nucleic acids allows for quantitation of
label in both the input and output affinity
fractionated samples (Fig. 2). Thus, the amount of
label present in the output sample may be used to
quantitate the number of heteroduplexes relative to the
known amount of labeled input sample.
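
The quantitation from label described above amounts to
a simple proportion, sketched here. The function name,
the units, and the numbers in the usage line are
hypothetical, and the sketch assumes that the label is
distributed uniformly over the input nucleic acid so
that counts are proportional to amount.

```python
def recovered_amount(input_amount_fmol, input_counts, output_counts):
    """Estimate the labeled nucleic acid recovered in the output.

    input_amount_fmol: known amount of labeled input nucleic acid.
    input_counts: label measured in the input sample.
    output_counts: label measured in the affinity-fractionated output.
    """
    if input_counts <= 0:
        raise ValueError("input_counts must be positive")
    if output_counts < 0:
        raise ValueError("output_counts cannot be negative")
    # Assumes uniform specific activity across the input.
    fraction_recovered = output_counts / input_counts
    return fraction_recovered * input_amount_fmol


if __name__ == "__main__":
    # Invented illustration: 5% of the input label is recovered.
    print(recovered_amount(200, 10_000, 500))
```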

Labeling of the mismatch binding protein (e.g.,
with 35S-methionine) also allows for detection and
optional quantitation of the fraction of heteroduplex
fragments in a mixture. For example, as shown in Fig.
5, one method includes immobilizing reference nucleic
acid on a solid support, such as a membrane,
hybridizing the immobilized reference nucleic acid
to test nucleic acid, exposing the membrane to mismatch
binding protein under binding conditions such as those
specified herein, and then washing away free mismatch
binding protein. Alternatively, test nucleic acid may
be immobilized to the support and hybridized to free
reference nucleic acid prior to binding.

In addition, a moiety that permits affinity
purification of nucleic acids can be used to modify the
test or reference nucleic acids for detection; e.g.,
biotin. After the mixture of modified (e.g.,
biotin-labeled) nucleic acids is exposed to the mismatch
binding protein, the mixture may then be selectively
enriched for the nucleic acid/binding protein complexes
by affinity purification. During this step, the free
nucleic acid and free mismatch binding protein will be
washed away. Once the nucleic acid mixture has been
separated from free mismatch binding protein, the
amount of label present in the bound nucleic acid
sample may be used to quantitate the number of
heteroduplexes in the mixture. Similarly, the amount
of label present in the bound protein may be used to
determine the number of mismatches present in the
mixture. Alternatively, instead of labeling the
mismatch binding protein, other methods for detecting
the presence of the mismatch binding proteins can be
used for quantitation of mismatches, such as an
enzyme-linked immunoassay.

If the goal of the genetic screening method is not
only to identify the presence of a nucleotide sequence
mismatch between test and reference nucleic acids, but
also to determine the nature and location of the
mismatch, then the affinity purified heteroduplex
nucleic acid can be cloned and sequenced to determine
the precise sequences and sequence differences between
the test and reference nucleic acids. For example,
the genetic disease hemophilia is caused by many
different mutations in a 26,000 base region of nucleic
acid in the gene encoding blood clotting factor VIII.
Thus, it is not possible to diagnose the disease by
identifying a known mutation. However, it is possible
to detect the many possible mutations which may be a
cause of hemophilia according to the invention. Other
genetic diseases, e.g., Huntington's disease, in which
neither the nature nor the location of the mutation
which causes the disease is known, may be both diagnosed
according to the invention, and also characterized as
to the identity (i.e., the nature and/or location) of
the underlying mutation.



Differential cloning of genomic nucleic acid can be
used with complex nucleic acid samples to eliminate
background heteroduplex molecules; i.e., heteroduplexes
that are formed when a sample is annealed with itself
due to the presence of non-unique sequences. This
technique is illustrated schematically in Fig. 8. For
example, if nucleic acid A and nucleic acid B are to be
compared for nucleotide sequence differences, and both
samples are a complex mixture of nucleic acid, when the
two samples are combined, and denatured and reannealed,
many heteroduplexes will form which are not the A/B
heteroduplexes which it is the goal to identify, i.e.,
which contain one strand from sample A mutated gene X
and the other strand from reference B normal gene X.
Instead, background heteroduplexes will form which
contain strands of non-unique nucleic acid that anneal
because they are largely homologous; i.e., A/A or B/B
heteroduplexes. This background problem may be
reduced using the differential cloning method described
above, as follows.

Heteroduplexes from denatured and reannealed A/A
nucleic acid and denatured and reannealed B/B nucleic
acid may be combined to form the reference nucleic
acid. The test nucleic acid (A/B heteroduplexes) will
include A nucleic acid and B nucleic acid that are
denatured and reannealed together rather than
separately. The reference (A/A and B/B) nucleic acid is
dephosphorylated to prevent ligation of unwanted
heteroduplexes to dephosphorylated vector nucleic acid,
and then combined with test nucleic acid
(heteroduplexes of A/B nucleic acid) in a ratio of
approximately 100 (reference) to 1 (test). The
combined mixture is separated by size on an agarose gel
and again denatured and reannealed in the gel. In the
reannealing process, unique A/B strands are more likely
to reanneal than non-unique strands because the latter
are more likely to reanneal with excess reference
strands. Cloning of the unique A/B test strands will
be highly favored due to the inability of
dephosphorylated A/A or B/B DNA to ligate to the
dephosphorylated vector. The differential cloning
technique may be varied as desired using the knowledge
of a person of skill in the art.

Alternatively, instead of using differential
cloning of genomic DNA, representational difference
analysis (RDA) can be used in Fig. 8 (see Lisitsyn et
al., supra).

In some circumstances, the goal of the genetic
screening may not be to identify the precise mismatch,
but to determine the sizes of heteroduplex nucleic acid
in an annealed sample identified as containing
heteroduplex nucleic acid. The size of a heteroduplex
may be determined by agarose gel electrophoresis of
affinity purified duplexes. Once the sizes of
heteroduplex fragments are known, size parameters may
be used to map the locations of differences in simple
nucleic acid samples, such as plasmid DNA, or to map the
locations of differences in more complex samples via
Southern blotting of heteroduplex nucleic acid.
Furthermore, where a region of interest is well-defined
or where genetic markers are known, other techniques
may be used to analyze heteroduplex nucleic acid, e.g.,
Restriction Fragment Length Polymorphism analysis.



The purified heteroduplex nucleic acid may be used
as a probe to screen a genomic library for other
sequences of interest. The heteroduplex-containing
sample may be further purified by affinity
fractionating the heteroduplexes, and/or PCR amplifying
the annealed mixture or refractionating the affinity
purified heteroduplexes, and cloning the heteroduplex
molecules.



In addition, any conventional technique for 
comparing nucleic acids, e.g., denaturing gradient gel 
electrophoresis, can be used to further analyze the 
heteroduplex nucleic acid. 

When comparing complex nucleic acid samples, it is
important to eliminate background; e.g., false
positives, or positive signals generated by reannealing
of two different regions within the same test nucleic
acid sample that contain some homology and some
sequence differences. Background can be eliminated by
using controls in which the test nucleic acid or
reference nucleic acid is denatured and reannealed with
itself. Computer-based assistance can be employed to
eliminate these artifacts. For example, a computer can
be programmed to examine the digitized images from the
gel electrophoresis of reannealed test nucleic acid
and/or reannealed reference nucleic acid comparisons,
and to remove these artifacts from the digitized gel
images resulting from a test/reference heteroduplex
comparison.
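
One way such computer-based removal of background could
be sketched is shown below. This is an assumed, heavily
simplified representation rather than a described
implementation: each digitized lane is reduced to a list
of band sizes in base pairs, the tolerance parameter is
hypothetical, and the band sizes in the usage lines are
invented.

```python
def remove_background(test_ref_bands, control_bands, tolerance_bp=5):
    """Drop bands that also appear in a self-annealed control lane.

    test_ref_bands: band sizes (bp) from the test/reference lane.
    control_bands: band sizes (bp) pooled from the lanes in which
        test or reference nucleic acid was reannealed with itself.
    tolerance_bp: hypothetical window within which two bands are
        treated as the same fragment.
    """
    return [
        band
        for band in test_ref_bands
        if not any(abs(band - ctrl) <= tolerance_bp for ctrl in control_bands)
    ]


if __name__ == "__main__":
    # Invented band sizes: the 480 bp band matches a control band
    # at 478 bp and is treated as background.
    print(remove_background([150, 480, 990], [478, 2000]))
```

A real image analysis would also need lane alignment and
intensity thresholds, which are outside this sketch.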

VIII. Detection of Heteroduplex Nucleic Acid in a
Mixture of Excess Competitor Nucleic Acid

The following experiment demonstrates that a test
and a reference nucleic acid sequence may be hybridized
and that a single base pair difference is detectable. In
this example, the nucleotide pair mismatch is known,
and the procedure results in detection of mutations in
a 16-mer substrate. In addition, 16-mer heteroduplex
nucleic acid was fractionated from homoduplex (i.e.,
fully complementary) nucleic acid. A 16-mer homoduplex
control was used to ensure that the method did not
fractionate matched nucleic acid to the same degree.
Both of the fragments were fractionated in the presence
of a large amount of (i.e., excess) competitor nucleic
acid to ensure the method could detect mismatches in a
background of matched nucleic acid.

Nucleic acid samples were prepared as follows. The
oligonucleotides DG6R (GAT CCG TCG ACC TGC A), DG4R
(CTA GGC AGT TGG ACG T) and DG5 (CTA GGC AGC TGG ACG T)
were ordered from Operon Technologies (Alameda,
California) and separately resuspended in TE buffer to
a concentration of 10 pMol/ul. DG6R was kinased with
5000 Ci/mmol 32P ATP. Lambda ladder DNA from Bethesda
Research Laboratories (Bethesda, MD) was used as a
competitor DNA.

Heteroduplexes were created as follows. 8 pMol of
the kinased DG6R and 10 pMol of DG4R in 40 ul of assay
buffer were placed in a 70°C water bath for 10 minutes.
The water bath was then switched off and allowed to cool
to room temperature to allow the oligonucleotides to
anneal. The result of this annealing reaction was
called DG-4/6 Het. The same annealing reaction was run
between DG5 and DG6R, and the result of this reaction
was called DG-5/6 Hom. DG-4/6 Het contains a GT
mismatch in place of the GC match present in DG-5/6
Hom.
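
The position of the mismatch can be checked directly
from the oligonucleotide sequences given above. The
sketch below assumes that DG4R and DG5 are printed in the
orientation in which they pair position by position with
DG6R (i.e., antiparallel to it); that orientation is an
inference from the sequences, and the function name is
illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(top, bottom_aligned):
    """List positions where two aligned strands are not Watson-Crick pairs.

    top: one strand, read left to right.
    bottom_aligned: the partner strand written so that its i-th
        base sits opposite the i-th base of `top`.
    Returns (1-based position, top base, bottom base) tuples.
    """
    top = top.replace(" ", "").upper()
    bottom = bottom_aligned.replace(" ", "").upper()
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    return [
        (i + 1, t, b)
        for i, (t, b) in enumerate(zip(top, bottom))
        if COMPLEMENT[t] != b
    ]


# Sequences as given in the text.
DG6R = "GAT CCG TCG ACC TGC A"
DG4R = "CTA GGC AGT TGG ACG T"
DG5 = "CTA GGC AGC TGG ACG T"

if __name__ == "__main__":
    print("DG-5/6 Hom:", find_mismatches(DG6R, DG5))
    print("DG-4/6 Het:", find_mismatches(DG6R, DG4R))
```

Under that alignment the DG5/DG6R duplex has no
mismatches, while the DG4R/DG6R duplex has a single G-T
mispair at position 9 of the 16-mer, consistent with the
description of DG-4/6 Het.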

The MutS protein was overproduced, as described by
Haber (1988, supra), at 42°C in MM294 mutS::Tn10 cells
that carried the lambda cI857 gene on pSE103 (Elledge et
al., 1985, J. Bacteriol. 162:777) and the MutS gene on
pGW1825 (Haber, 1988, supra), all of which references
are hereby incorporated by reference. MutS was purified
using the method of Su and Modrich (1986, supra).
Dilution buffer for MutS includes 0.02 M KPO4 pH
7.4/0.05 M KCl/0.1 mM EDTA/1 mM dithiothreitol/0.1 mg/ml
bovine serum albumin. The purified and concentrated
fraction containing MutS was used in the following
experiments. MutS polyclonal antibody was also produced
according to the method of Haber (1988, supra). The
binding of MutS to heteroduplex nucleic acid was
performed in assay buffer, as described above.

Affinity fractionation of heteroduplex nucleic acid
was performed as follows. Two binding reactions were
incubated on ice for 30 minutes, one containing
heteroduplex nucleic acid and a control containing
homoduplex DNA. The heteroduplex reaction contained
14.5 pMol of MutS, 200 fMol of DG-4/6 Het, and 2 ug of
competitor nucleic acid in a total volume of 20 ul.
The control reaction contained 14.5 pMol of MutS, 200
fMol of DG-5/6 Hom, and 2 ug of competitor nucleic acid
in a total volume of 20 ul. After 30 minutes on ice,
5 ul of anti-MutS antibody was added to each binding
reaction, and the result was incubated on ice for 60
minutes. 10 ul of Staphylococcus aureus cells that had
been washed twice in assay buffer were added to both
binding reactions (see McKay, 1981, supra) and the
result was incubated on ice for an additional 30
minutes. Both reactions were then spun in a microfuge
for 3 minutes at 4°C and the pellet was washed 8 times
in assay buffer.

The pellet from each binding reaction was counted
in a scintillation counter to test for
immunoprecipitation of heteroduplex nucleic acid.
After normalizing for the total number of counts in
each reaction, 53-fold more oligonucleotides
precipitated in the heteroduplex reaction than in the
homoduplex reaction. Thus, heteroduplexes containing a
single base pair mismatch could be detected after
affinity fractionation of a mixture containing excess
competitor nucleic acid.
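
The normalization behind a fold-enrichment figure of
this kind can be sketched as follows. The function and
the counts in the usage line are invented for
illustration and are not the experimental data; the
sketch assumes that "normalizing for the total number of
counts" means dividing the pellet counts of each
reaction by the total counts added to that reaction.

```python
def fold_enrichment(het_pellet, het_total, hom_pellet, hom_total):
    """Ratio of precipitated fractions, heteroduplex over homoduplex.

    Each pellet count is first divided by the total counts in its
    own reaction, so that reactions with different amounts of
    label can be compared.
    """
    if min(het_total, hom_total, hom_pellet) <= 0:
        raise ValueError("totals and homoduplex pellet must be positive")
    het_fraction = het_pellet / het_total
    hom_fraction = hom_pellet / hom_total
    return het_fraction / hom_fraction


if __name__ == "__main__":
    # Invented counts: 2% precipitated versus 0.1% precipitated.
    print(round(fold_enrichment(2400, 120_000, 100, 100_000), 3))
```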

IX. Detection of a Mismatched Nucleotide Pair in a 
1 KB Fragment 

The invention may be used to identify a single base
pair change in a 1 KB region of nucleic acid in the
presence of an excess of matched nucleic acid
competitor.

DNA samples and heteroduplexes were prepared as
follows. Single stranded circular DNA from M13mp8 DNA
containing a G to A transition mutation in the unique
PstI site (see Loechler, 1984, Proc. Natl. Acad. Sci.
USA 80:6271, hereby incorporated by reference) was
denatured and annealed in the presence of linear duplex
wild-type M13mp8 DNA to create a heteroduplex (see
Kramer et al., 1989, J. Bacteriol. 171:5339, hereby
incorporated by reference). The heteroduplex thus
formed contained a C-A mismatch in the PstI site, which
prevented cleavage of the site by PstI. Control
homoduplex DNA was created using the sense and
antisense strands of wild-type M13mp8 DNA. The 1 KB
AvaII-BglII fragment containing the mismatch was
isolated from both the heteroduplex and wild-type
homoduplex DNA by gel purification. The resulting
homoduplex and heteroduplex fragments were separately
phosphatased and end labeled with 32P ATP. Free ATP
was eliminated with spin columns from the labeled
heteroduplex and homoduplex 1 KB DNA fragments. Lambda
ladder DNA from BRL was used as a competitor.



Affinity fractionation of heteroduplex nucleic acid
was performed as follows. Two binding reactions were
incubated on ice for 30 minutes, one of which contained
the mismatched nucleic acid and a control which
contained matched nucleic acid. The
heteroduplex-containing reaction consisted of 42 pMol of
MutS, 7 fMol of the C-A mismatched 1 KB fragment, and
1 ug of competitor nucleic acid in a total volume of
10 ul. The homoduplex reaction contained the same
components, but substituted matched nucleic acid for the
mismatched heteroduplex nucleic acid. After 30 minutes
on ice, 10 ul of anti-MutS antibody was added to each
binding reaction, and the result was incubated on ice
for 60 minutes. Then 10 ul of SAC cells that had been
washed twice in assay buffer were added to both binding
reactions, and the result was incubated on ice for an
additional 30 minutes. Both binding reactions were
then spun in a microfuge for 3 minutes at 4°C, and the
resulting pellet was washed 6 times in assay buffer.
The pellet from each binding reaction was counted 
in a scintillation counter to test for specific 
fractionation of heteroduplex nucleic acid. After 
normalization for the total number of counts in each 
reaction, 9.6-fold more fragments precipitated in the
heteroduplex reaction than in the homoduplex reaction.
Thus, a mismatch of a single nucleic acid base pair
could be detected in the presence of a large amount of
competitor nucleic acid.
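
By way of illustration only, the normalization described above may be sketched in Python as follows. The function names and the count values are hypothetical and are not taken from the experiment; the sketch merely shows how pellet counts may be normalized to the total counts in each reaction before the heteroduplex and homoduplex reactions are compared.

```python
def normalized_recovery(pellet_counts, total_counts):
    """Fraction of the input label recovered in the washed pellet."""
    if total_counts <= 0:
        raise ValueError("total_counts must be positive")
    return pellet_counts / total_counts


def fold_enrichment(het_pellet, het_total, hom_pellet, hom_total):
    """Ratio of normalized heteroduplex recovery to normalized homoduplex recovery."""
    het = normalized_recovery(het_pellet, het_total)
    hom = normalized_recovery(hom_pellet, hom_total)
    if hom == 0:
        raise ValueError("homoduplex recovery is zero; ratio is undefined")
    return het / hom


# Hypothetical scintillation counts, chosen only for illustration.
example = fold_enrichment(het_pellet=4800, het_total=100000,
                          hom_pellet=600, hom_total=120000)
print(f"{example:.1f}-fold")
```

Normalizing each pellet to its own reaction total prevents a difference in labeling efficiency between the two fragments from being mistaken for selective binding.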

X. Detection of a Mismatched Nucleotide Pair in a 
Mixture of Nucleic Acid Fragments 

The invention may be used to detect a single 
nucleotide pair mismatch in a mixture of nucleic acid 
fragments, as described below. 

A mixture of homoduplex and heteroduplex nucleic 
acid was prepared from purified PstI+ and PstI- M13mp8
DNA. The PstI+ DNA is wild-type M13mp8 DNA, which is
cleavable by the restriction enzyme PstI when in
double-stranded form, while the PstI- DNA is M13mp8 DNA
with a single base C to T mutation in the unique PstI
site (the second C in the PstI site is mutated, which
prevents cleavage by PstI). 75 ug each of PstI- DNA and
PstI+ DNA were separately cleaved with the EcoRI and PvuI
restriction enzymes in a total volume of 250 ul each.
200 ul of each reaction were combined, phenol/chloroform
extracted, ethanol precipitated, and resuspended in 1x SSC
in an Eppendorf tube. The tube was boiled in a beaker over
a hot water bath for 10 minutes, left to cool to 65
degrees for 15 minutes, and then moved to a 65 degree water
bath, which was switched off and left overnight to
cool. The sample was run on a 2% agarose gel, and the
159 bp band was excised. The 159 bp fragments were
purified from the gel slice and resuspended in TE
buffer. The fragments were then labeled with 32P-dATP
in a Klenow fill-in reaction. The unincorporated dATP
was removed with a spin column. The purified DNA
included both heteroduplex and homoduplex nucleic acid.
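
The composition of such an annealed mixture may be estimated with the following Python sketch. It assumes, as a simplification not stated in the text, that denatured strands reanneal at random and without strand bias; the function name is hypothetical.

```python
def duplex_fractions(wild_type_fraction):
    """Expected duplex composition after random reannealing.

    wild_type_fraction is the molar fraction of wild-type DNA in the
    denatured pool; the remainder is taken to be mutant DNA.
    """
    if not 0.0 <= wild_type_fraction <= 1.0:
        raise ValueError("wild_type_fraction must lie between 0 and 1")
    w = wild_type_fraction
    m = 1.0 - w
    return {
        "wild-type homoduplex": w * w,
        "mutant homoduplex": m * m,
        # Covers both strand combinations (two distinct mismatches).
        "heteroduplex": 2.0 * w * m,
    }


# Equal amounts of the two digests were combined, so w = 0.5.
equimolar = duplex_fractions(0.5)
print(equimolar)
```

Under this assumption an equimolar pool yields about half heteroduplex and half homoduplex, which is consistent with the purified DNA containing both species.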

Mismatch binding protein was bound to the nucleic 
acid mixture in a total volume of 10 ul consisting of
1 ul of the DNA mixture (19 fMol), 2 ul of the mismatch
binding protein MutS (1 ug), and 1 ul of poly dIdC
competitor nucleic acid (1 ug). A control reaction was
identically prepared except that it did not contain 
MutS. Binding was performed on ice for 30 minutes. 
The MutS reaction and the control reaction were 
electrophoresed on a 6% non-denaturing polyacrylamide
gel in tris-acetate-EDTA (TAE) buffer. 2 ul of a 50% sucrose
solution was added to each reaction just prior to gel
loading.

Figure 7 shows results from an autoradiogram of the
polyacrylamide gel. In Lane 1, the control reaction
shows a single 159 bp band, while Lane 2 shows both the
159 bp band arising from the homoduplex component of
the DNA mixture and a shifted band of higher apparent
molecular weight corresponding to the heteroduplex
component of the mixture. Lane 3 shows another control
in which the MutS protein was heated prior to the
binding reaction. As the results show, heat-denatured
MutS does not bind to heteroduplex nucleic acid and thus
does not result in a band shift in the gel.

XI. Preparation of Histidine-tailed MutS Protein 

A variant of the native Salmonella MutS protein was 
created that contained six histidines at its amino
terminus to facilitate purification of the His-MutS 
protein or recovery of the His-MutS 
protein/heteroduplex nucleic acid complex. 

The wild type Salmonella MutS gene was PCR
amplified from the plasmid pGW1811 using the following
primers:

DKG-MUTS5T
5' CGG AAT TCG CAT CAT CAT CAT CAT CAT ATG AAT GAG TCA
TTT GAT AAG G (SEQ ID NO. 1)

DKG-MUTS3X
5' CGC GGA TCC TTA CAC CAG ACT TTT CAG CCG
(SEQ ID NO. 2)
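
The design features of these primers can be checked with a short Python sketch. The sequences are copied from SEQ ID NOS. 1 and 2; the helper names are hypothetical, and the check is a plain string search that does not verify reading frame.

```python
import re

# Primer sequences as listed above, written 5' to 3' with spaces removed.
DKG_MUTS5T = "CGG AAT TCG CAT CAT CAT CAT CAT CAT ATG AAT GAG TCA TTT GAT AAG G".replace(" ", "")
DKG_MUTS3X = "CGC GGA TCC TTA CAC CAG ACT TTT CAG CCG".replace(" ", "")

# Recognition sequences of the two enzymes used for cloning into pUC18.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def find_sites(seq):
    """Return the 0-based position of each site, or -1 if it is absent."""
    return {name: seq.find(site) for name, site in SITES.items()}


def histidine_codons_before_atg(seq):
    """Count consecutive histidine codons (CAT or CAC) directly 5' of an ATG."""
    match = re.search(r"((?:CAT|CAC)+)ATG", seq)
    return len(match.group(1)) // 3 if match else 0


print(find_sites(DKG_MUTS5T), histidine_codons_before_atg(DKG_MUTS5T))
print(find_sites(DKG_MUTS3X))
```

The forward primer carries an EcoRI site followed by six histidine codons and the ATG, and the reverse primer carries a BamHI site, which agrees with the EcoRI and BamHI digestion used for cloning.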

The amplified nucleic acid fragment was cut with
EcoRI and BamHI and cloned into the polylinker site of
pUC18, which placed the MutS-encoding DNA under the
control of the inducible Lac promoter. The resulting
plasmid, called pDKGA1, was used to transform the
E. coli strain GW3732 (Haber, 1988, supra).






A clone (GW3732 pDKGA1) was isolated which
contained the plasmid pDKGA1. Because the Lac
expression system permits a moderate level of basal
transcription, some His-MutS protein is produced even
under conditions which result in repression of the lac
promoter. This low level of His-MutS production
results in poor growth of the transformed cells, and
the selective pressure can result in loss of the
plasmid from the transformed cells. Thus, care was
taken to ensure that the culture did not grow to high
density under selective conditions. The His-MutS
protein was prepared and purified as follows.

Two 1 liter cultures of GW3732 cells containing
plasmid pDKGA1 were grown with shaking at 37°C to an
OD600 of 0.75. The cultures were then induced to
produce His-MutS by adding 1 mM IPTG. The cells were
grown for another two hours and then harvested by
centrifugation to a cell pellet, decanting the
supernatant, and freezing the pellets at -80°C.
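
The volume of inducer needed to reach the stated final concentration follows from the usual dilution relation C1 x V1 = C2 x V2. The Python sketch below assumes a 1 M IPTG stock, which is a hypothetical value not given in the text, and ignores the small increase in culture volume.

```python
def stock_volume_ml(final_mM, culture_ml, stock_mM):
    """Volume of stock (ml) that brings culture_ml to final_mM."""
    if stock_mM <= 0 or stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_ml / stock_mM


# 1 mM final IPTG in a 1 liter culture, from an assumed 1 M (1000 mM) stock.
iptg_ml = stock_volume_ml(final_mM=1.0, culture_ml=1000.0, stock_mM=1000.0)
print(iptg_ml)
```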

A 500 ml culture pellet was then defrosted on ice
and resuspended in lysis buffer (20 mM KPO4 pH 7.4, 10
mM betamercaptoethanol, 0.5 M KCl, 1 mM PMSF, 200 ug/ml
lysozyme). The cells were sonicated in an ice water
bath, and cell debris was removed by centrifugation at
30,000 rpm for 30 minutes. The supernatant was
filtered through a 0.45 micron filter and applied to a
Qiagen nickel column at a flow rate of 0.5 ml/minute.
The column was pre-equilibrated with Buffer D (20 mM
KPO4 pH 7.4, 10 mM betamercaptoethanol, 0.5 M KCl, 1 mM
PMSF). The column was washed with 75 ml of Buffer D,
followed by another 10 ml wash of Buffer D with 10 mM
imidazole. The protein was eluted with 80 mM imidazole
in Buffer D. The recovered protein was dialyzed
against dialysis buffer (20 mM KPO4 pH 7.4, 10 mM
betamercaptoethanol, 0.5 M KCl, 0.1 mM EDTA). Fig. 9
is a polyacrylamide gel showing results of histidine-tailed
MutS purification using an imidazole gradient.
The His-MutS protein appears in the purification near
the 97 kD marker. Histidine-tailed MutS produced as
described above was shown to be biologically active in
selective binding to nucleic acid mismatches as
follows.



XII. Selective Purification of Heteroduplex Nucleic Acid
Using Histidine-tailed MutS Protein

Homoduplex and heteroduplex nucleic acid were
prepared as follows. Three oligonucleotides:

SRB-5-G 3' GAC ATC TGA TCC GTC GAC CTG CAG ATG AAG A 5'
(SEQ ID NO. 3)

SRB-3-T 5' CTG TAG ACT AGG CAG TTG GAC GTC TAC TTC T 3'
(SEQ ID NO. 4)

SRB-3-C 5' CTG TAG ACT AGG CAG CTG GAC GTC TAC TTC T 3'
(SEQ ID NO. 5)

were obtained from Operon Technologies (Alameda,
California). Each oligonucleotide was resuspended in
TE buffer to a concentration of 10 pMol/ul. SRB-3-T
was end labeled in a kinase reaction using 5000 Ci/mmol
32P-ATP.



Heteroduplex nucleic acid was prepared by combining
8 pMol of the kinased SRB-5-G oligonucleotide and 10
pMol of the SRB-3-T oligonucleotide, followed by
incubation of the combined oligonucleotides in a 70°C
water bath for 10 minutes. The oligonucleotides were
allowed to anneal by switching off the water bath and
allowing it to cool to room temperature. The duplex
formed as a result of this annealing reaction was
called SRB/HET.



Homoduplex nucleic acid was prepared by combining 8
pMol of the kinased SRB-5-G oligonucleotide and 10 pMol
of the SRB-3-C oligonucleotide, and treating the
combined oligonucleotides as described above for
preparation of heteroduplex SRB/HET. The resultant
homoduplex nucleic acid was called SRB/HOM. SRB/HET
and SRB/HOM differ in that the heteroduplex nucleic
acid contains a G-T mismatch in place of a G-C match
present in the homoduplex nucleic acid.
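
The relationship between the two duplexes can be confirmed from the listed sequences with the following Python sketch. The sequences are those of SEQ ID NOS. 3 to 5; the function name is hypothetical, and only Watson-Crick pairing is treated as a match.

```python
# SRB-5-G is listed 3' to 5'; SRB-3-T and SRB-3-C are listed 5' to 3',
# so the strands can be compared position by position as written.
SRB_5_G = "GAC ATC TGA TCC GTC GAC CTG CAG ATG AAG A".replace(" ", "")
SRB_3_T = "CTG TAG ACT AGG CAG TTG GAC GTC TAC TTC T".replace(" ", "")
SRB_3_C = "CTG TAG ACT AGG CAG CTG GAC GTC TAC TTC T".replace(" ", "")

WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatches(strand_3_to_5, strand_5_to_3):
    """List (1-based position, base, base) for every non-Watson-Crick pair."""
    if len(strand_3_to_5) != len(strand_5_to_3):
        raise ValueError("strands must be the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(strand_3_to_5, strand_5_to_3))
        if WATSON_CRICK[a] != b
    ]


print("SRB/HOM:", mismatches(SRB_5_G, SRB_3_C))
print("SRB/HET:", mismatches(SRB_5_G, SRB_3_T))
```

SRB/HOM is fully complementary, while SRB/HET contains a single G-T pair at position 16 of the 31 nucleotide duplex.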

Affinity fractionation of heteroduplex nucleic acid 
was accomplished by performing a binding reaction 
between the duplex nucleic acid and the His-MutS 
mismatch binding protein prepared as described above. 
Briefly, two binding reactions were performed, one 
containing heteroduplex nucleic acid and a control 
containing homoduplex nucleic acid. The heteroduplex 
reaction contained 200 fMol of SRB/HET and 100 pMol of
His-MutS, and binding was performed on ice for 30
minutes in assay buffer (20 mM KPO4 pH 7.6, 5 mM
MgCl2, 0.1 mM betamercaptoethanol). The homoduplex
binding reaction was performed using 200 fMol of
SRB/HOM in place of SRB/HET under the same conditions.

Each reaction was added to 100 ul of Ni-NTA
(nickel) resin (Qiagen) in a spin column that had been
washed in assay buffer. After addition of the reaction
mixtures, each spin column was washed six times with
assay buffer containing 1% Triton, and bound DNA was
eluted with 1 M imidazole, pH 7.0. In the case of the
SRB/HET DNA, 27% of the DNA was recovered, while in the
case of the SRB/HOM DNA, 2% of the DNA was recovered.
The results demonstrate that the His-MutS mismatch
binding protein selectively binds heteroduplex nucleic
acid, and that the His-MutS/heteroduplex nucleic acid
complex may be selectively retained via affinity
purification on a nickel column.
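
The figures reported in this example can be restated as a selectivity ratio and a protein-to-DNA molar excess. The Python sketch below uses only the recoveries and amounts given above; the function names are hypothetical.

```python
def selectivity(het_recovery_pct, hom_recovery_pct):
    """Ratio of heteroduplex recovery to homoduplex recovery."""
    if hom_recovery_pct <= 0:
        raise ValueError("homoduplex recovery must be positive")
    return het_recovery_pct / hom_recovery_pct


def molar_excess(protein_pmol, dna_fmol):
    """Molar ratio of protein to DNA (1 pMol = 1000 fMol)."""
    if dna_fmol <= 0:
        raise ValueError("DNA amount must be positive")
    return protein_pmol * 1000.0 / dna_fmol


srb_selectivity = selectivity(27.0, 2.0)   # SRB/HET versus SRB/HOM
his_muts_excess = molar_excess(100.0, 200.0)
print(srb_selectivity, his_muts_excess)
```

On these numbers the heteroduplex was recovered 13.5 times more efficiently than the homoduplex, with His-MutS present in 500-fold molar excess over the duplex.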


XIII. Selective Recognition and Purification of
Mutations in the ARC Gene using PCR Amplified
Nucleic Acid

Heteroduplex and homoduplex nucleic acid were
prepared as follows. Plasmids derived from pTA200
containing the wild-type ARC gene and EG36 mutant ARC
gene (Vershon et al., Proteins: Structure, Function and
Genetics 1:302, 1986, hereby incorporated by reference)
were isolated and used in separate PCR reactions to
amplify a region of the ARC gene. PCR reactions
included 100 ng of plasmid DNA, 60 pMol of both of the
primers ARC5-1 and ARC3-5, and standard PCR reaction
components (i.e., PCR buffer, thermostable DNA
polymerase, 2 mM of each oligonucleotide). The primer
oligonucleotides have the following sequences:

ARC5-1 CCG GCG GAT GAA AGG AAT GAG CAA AAT G
(SEQ ID NO. 6)

ARC3-5 GGC TTC AAC TTT ACG CGC CAA
(SEQ ID NO. 7).

PCR reaction products from the wild-type and EG36
plasmids were gel purified on a 1.5% TAE
(tris-acetate-EDTA) gel, and the 200 bp band was isolated
from both. The gel-purified 200 bp PCR products derived
from the wild-type and EG36 plasmids were named ARC-WT
and ARC-EG36, respectively.

A mixture of heteroduplex nucleic acid and
homoduplex nucleic acid, ARC-WT/EG36, was created as
follows. A total of 500 ng each of ARC-WT and ARC-EG36
were combined in a 50 mM KCl solution and boiled for
five minutes in a water bath. The sample was then
allowed to cool slowly to room temperature, and then gel
purified on a 1.5% TAE gel. The resulting DNA contained
both homoduplex nucleic acid and heteroduplex nucleic
acid with G-T and C-A mismatches. The DNA was then
kinased with 32P-ATP, and unincorporated ATP was
separated using a spin column.

ARC-WT/WT homoduplex nucleic acid was created as
follows. A total of 1000 ng of ARC-WT DNA was suspended
in a 50 mM KCl solution and boiled for five minutes in a
water bath. The sample was then allowed to cool slowly
to room temperature, and then gel-purified on a 1.5% TAE
gel. The resulting DNA contained homoduplex DNA that
had been reannealed. The DNA was then kinased with
32P-ATP, and unincorporated ATP was separated using a spin
column.

Affinity purification of heteroduplex DNA was
performed as follows. A total of 800 fMol of
ARC-WT/EG36 was combined on ice with a final concentration
of 0.8 uM His-MutS in assay buffer (20 mM KPO4 pH 7.4, 5
mM MgCl2, 0.4 mM β-mercaptoethanol). After incubation
for 30 min. on ice, the reaction was added to a spin
column of Ni-NTA nickel resin. Before use, the spin
column was washed and equilibrated in assay buffer.
After the reaction was added to the spin column, the
column was washed six times with assay buffer containing
1% Triton, and eluted with 1 M imidazole pH 7.0. An
identical affinity purification reaction was performed
with ARC-WT/WT. In the case of ARC-WT/EG36, 4% of the
DNA was recovered, and in the case of ARC-WT/WT, 2% of
the DNA was recovered. The results demonstrate that the
His-MutS mismatch binding protein selectively binds
heteroduplex DNA, and that the His-MutS/heteroduplex DNA
complex may be selectively retained via affinity
purification.


XIV. Selective Recognition and Purification of 
Amplified Human Nucleic Acid Containing a 
Genetic Mutation 

A genetic mutation contained within human nucleic
acid may be detected as follows. Nucleic acid encoding
wild type and mutant human β-globin sequences may be
cloned into plasmids as described by Abrams et al.,
Genomics 7:463, 1990, hereby incorporated by reference.
The plasmid pEGb0c39, described in Abrams et al.,
contains a naturally occurring C to T mutation in codon
39 of the β-globin gene; the plasmid pEGwt contains the
wild-type sequence. These DNA fragments are amplified
by performing large scale plasmid preparation of pEGwt
and pEGb0c39. Each amplified DNA is then digested with
the restriction enzymes NcoI and BamHI, phenol
extracted, and ethanol precipitated.

Heteroduplex nucleic acid is then formed as
follows. 25 ug of digested pEGb0c39 and 25 ug of
digested pEGwt DNA are combined in a 50 ul volume of 50
mM NaCl (β-Het DNA). The sample is then heated to 99°C
for more than 5 min. and allowed to cool slowly to room
temperature. The same reaction is performed using
pEGwt DNA to form β-Hom DNA. Each of β-Het and β-Hom
is then gel-purified as a 438 bp NcoI-BamHI fragment.
Purified β-Het fragment is called BO/WT DNA and
purified β-Hom is called WT/WT DNA.






Affinity fractionation of heteroduplex nucleic acid
is performed as follows. Two binding reactions are
incubated on ice for 30 minutes, one containing the
BO/WT DNA and a control containing WT/WT DNA. The 14
ul binding reactions contain appropriate amounts of DNA
and His-MutS protein. Binding is performed in assay
buffer (20 mM Tris-Cl pH 7.6, 5 mM MgCl2, 0.01 mM EDTA,
0.1 mM DTT). Each binding reaction is added to 100 ul
of nickel resin in a spin column that has been washed
in assay buffer. The two spin columns are washed six
times, and DNA is eluted with 1 M imidazole, pH 7.0.

Other Embodiments 

Other embodiments are within the following claims.

It is further anticipated that other kinds of
mismatches, such as asymmetric methylation, can be
detected with proteins that bind to hemi-methylated
nucleic acids, such as methyltransferases, e.g., dam.
Claims

1. A method of genetic screening for a nucleotide
variation, said method comprising
(A) providing an amplified test nucleic acid
suspected to contain a nucleotide variation and a
reference nucleic acid;
(B) subjecting said test and reference
nucleic acids to conditions sufficient to produce an
annealed mixture comprising a heteroduplex, wherein
each said heteroduplex comprises a mismatched
nucleotide pair;
(C) subjecting said annealed mixture to a
mismatch binding protein under conditions sufficient to
bind said binding protein to said mismatched nucleotide
pair, one member of which comprises said suspected
nucleotide variation; and
(D) detecting, as an indication of a genetic
variation between said test and reference nucleic
acids, the presence of said mismatched nucleotide pair.

2. A method of genetic screening for a nucleotide
variation, said method comprising
(A) providing a test nucleic acid suspected
to contain a nucleotide variation and a reference
nucleic acid;
(B) annealing said test and reference nucleic
acids under conditions sufficient to produce a mixture
comprising a first concentration of heteroduplex and
excess homoduplex nucleic acid, wherein said nucleotide
variation comprises one member of a mismatched pair in
said heteroduplex, wherein said excess homoduplex
nucleic acids are generated by reannealing of a first
test or reference nucleic acid strand with a fully
complementary second test or reference nucleic acid
strand;
(C) fractionating said heteroduplex from said
mixture by affinity purification in which a mismatch
binding protein binds to said heteroduplex;
(D) recovering heteroduplex from said
affinity purification to produce a heteroduplex sample
which contains a second, higher concentration of said
heteroduplex; and
(E) detecting, as an indication of a genetic
variation between said test and reference nucleic
acids, the presence of a mismatched nucleotide pair
in said sample.

3. A method of enriching a mixture of duplex
nucleic acids for heteroduplex nucleic acid, said
method comprising
(A) providing a mixture of nucleic acids
comprising a first concentration of a heteroduplex
comprising a test nucleic acid strand and a reference
nucleic acid strand, and excess homoduplex nucleic
acids, wherein said excess homoduplex nucleic acids are
generated by reannealing of a first test or reference
nucleic acid strand with a fully complementary second
test or reference nucleic acid strand;
(B) separating said heteroduplex nucleic acid
from said mixture by affinity purification in which a
mismatch binding protein binds to said heteroduplex
nucleic acid; and
(C) recovering said heteroduplex nucleic acid
from said binding protein to produce a mixture that
contains a second, higher concentration of said
heteroduplex.

4. The method of claim 3 wherein step B is
conducted by forming a complex between heteroduplex and
said mismatch binding protein and separating said
complex from uncomplexed duplex.

5. The method of any one of claims 1 or 2 wherein
said detecting step comprises detecting one of: said
mismatch binding protein bound to said heteroduplex,
and said heteroduplex bound to said mismatch binding
protein.

6. The method of claim 5 wherein said heteroduplex
comprises a detectable moiety and said detecting step
comprises detecting said detectable moiety.

7. The method of claim 5 wherein said mismatch
binding protein further comprises a detectable moiety
and said detecting step comprises detecting said
detectable moiety.

8. The method of claim 6 wherein said moiety
comprises a label, and said detecting step comprises
detecting label bindable by said mismatch binding
protein.

9. The method of claim 7 wherein said moiety
comprises a label and said detecting step comprises
detecting label bindable to said heteroduplex.

10. The method of claim 5 wherein said detecting
step comprises forming an immune complex between one of
said bound mismatch binding protein or said bound
heteroduplex and an antibody.

11. The method of any one of claims 1 or 2 wherein
said mismatched nucleotide pair is of unknown identity
or location, and further comprising the step of
determining the identity or location of said mismatched
pair.



12. The method of claim 11 wherein said 
determining step comprises analyzing the nucleotide 
sequence of said test or reference nucleic acid of said 
heteroduplex. 

13. The method of claim 2 wherein said steps C and 
D are repeated prior to performing step B. 

14. The method of claim 1 wherein after step (C) 
but prior to step (D), said method further comprises
the additional steps of isolating said heteroduplex 
complexes and amplifying said heteroduplex comprising 
said mismatched nucleotide pair. 

15. The method of claim 14 wherein said test 
nucleic acid comprises a first PCR sequence and said 
reference nucleic acid comprises a second PCR sequence. 

16. The method of claim 2 wherein after step (D) 
but prior to step (E), said method further comprises 
the additional step of amplifying said heteroduplex 
comprising said mismatched nucleotide pair. 

17. The method of claim 16 wherein said test
nucleic acid comprises a first PCR sequence and said 
reference nucleic acid comprises a second PCR sequence. 

18. The method of claim 3 or 4, said method 
further comprising after step (C) the step of 
amplifying said recovered mixture. 
19. The method of claim 18 wherein said test
nucleic acid comprises a first PCR sequence and said
reference nucleic acid comprises a second PCR sequence.

20. The method of claim 14 wherein said
heteroduplex further comprises PCR tails, and said
amplifying step comprises performing a polymerase chain
reaction.

21. The method of claim 16 wherein said
heteroduplex further comprises PCR tails, and said
amplifying step comprises performing a polymerase chain
reaction.

22. The method of claim 18 wherein said
heteroduplex further comprises PCR tails, and said
amplifying step comprises performing a polymerase chain
reaction.

23. The method of claims 3 or 4 wherein the
reference nucleic acid is labeled, said method further
comprising the step of, prior to said separating step
(B), adding excess unlabeled nucleic acid to said
mixture as a competitor, thereby to reduce background.

24. The method of claims 3 or 4 wherein the
reference and test nucleic acids comprise PCR tails,
and said method further comprises the steps of:
(i) prior to said separating step, adding
excess homoduplex nucleic acid lacking PCR tails; and
(ii) after said recovering step, amplifying
said recovered mixture, thereby to reduce background.
25. The method of claim 3 or 4 wherein said
mismatch binding protein comprises a histidine tail.

26. The method of claim 3 or 4 wherein said
mismatch binding protein comprises a flag sequence and
said solid support comprises an antibody that binds to
said flag sequence.

27. A kit for detecting a heteroduplex nucleic
acid as an indication of genetic variation, said kit
comprising:
a mismatch binding protein, and
means for separating a heteroduplex.

28. A kit for separating a heteroduplex nucleic
acid from a mixture of heteroduplex and homoduplex
nucleic acids, said kit comprising:
a mismatch binding protein coupled to a solid
support, and
means for separating said heteroduplex.

29. A kit for separating a heteroduplex nucleic
acid from a mixture of heteroduplex and homoduplex
nucleic acids, said kit comprising:
a protein capable of binding a mismatch
binding protein, and
means for separating said heteroduplex.

30. A kit for separating a heteroduplex nucleic
acid from a mixture of heteroduplex and homoduplex
nucleic acids, said kit comprising:
a protein that binds a complex comprising a
mismatch binding protein and a heteroduplex, and
means for separating said heteroduplex.
31. The kit of any one of claims 26-29 further
comprising
a reference nucleic acid.

32. The kit of claim 30 or 31, further comprising
a mismatch binding protein.

33. The kit of any one of claims 27-30 wherein
said means comprises a buffer suitable for detecting or
separating said heteroduplex.

34. The kit of claim 27 wherein said mismatch
binding protein is immobilized on a solid support.

35. The kit of claim 29 or 30 wherein said protein
capable of binding said mismatch binding protein is
immobilized on a solid support.

36. A solid support for preferentially binding
heteroduplex nucleic acids, said support comprising:
a mismatch binding protein coupled to a
solid support.

37. A solid support for preferentially binding
heteroduplex nucleic acids, said support comprising:
a first protein, coupled to a solid
support, capable of binding a mismatch binding protein.

38. The solid support of claim 36 or 37 wherein
said solid support comprises an affinity matrix.